Sequence assembly and annotation
To explore the transcriptome response to viral infection by vector and non-vector aphids, we performed RNA-Seq analysis on adult S. graminum and R. padi exposed to BYDV-infected, WDV-infected and uninfected wheat plants. The S. graminum and R. padi cDNA libraries were sequenced, which generated 72,092 and 68,996 unigenes, respectively (Table 1). The N50 sizes were 1,481 kb and 1,497 kb, respectively. The Q30 of each transcriptome library was above 85.74% and 85.62% for S. graminum and R. padi, respectively.
Nr (non-redundant protein sequence database) homologous species distribution revealed that 71.64% of S. graminum sequences matched with Acyrthosiphon pisum (Fig. 1 A), and 67.81% of R. padi sequences matched with A. pisum (Fig. 1 B). For the Gene ontology assignment, 7263 and 8164 unigenes in S. graminum and R. padi could be annotated. For S. graminum, 6200 unigenes were associated with biological process, 3080 with cellular component, and 6344 with molecular function (Fig. S1 A). In R. padi, 7104 unigenes were associated with biological process, 3597 with cellular component, and 7173 with molecular function (Fig. S1 B). These sequences have been submitted to the NCBI Sequence Read Archive (SRA), and the accession number is PRJNA490258.
DEGs in S. graminum and R. padi, as the vector/non-vector, in response to feeding on virus-infected wheat plants
For S. graminum fed on BYDV-infected wheat, 1525 DEGs were identified, with 693 upregulated and 832 downregulated. There were fewer DEGs in the S. graminum fed on WDV-infected wheat (494), and the majority were downregulated (73.28%). There were 589 DEGs in R. padi exposed to BYDV-infected wheat, with 247 upregulated and 342 downregulated. A total of 343 DEGs were identified in R. padi exposed to WDV-infected wheat, with 75.8% downregulated. The change ratios of most DEGs were between 2-5 and 25 (Fig. 2). Gene ontology (GO) and KEGG enrichment analyses were conducted for all DEGs.
Distribution and enrichment of DEGs in vector and non-vector aphids into gene ontologies (GOs)
Gene ontology (GO) analysis of all the DEGs in S. graminum and R. padi was conducted to classify the sequences into the biological process, molecular function and cellular component categories (Fig. S2–Fig. S5). The most enriched category in both vector and non-vector aphids was metabolic process (Fig. S2–Fig. S5). Among all comparisons, the differentially expressed genes contained a high proportion of genes involved in development, metabolism, reproduction and growth, and the number in vector aphids was higher (29 DEGs in Sg-BYDV) than that in non-vector aphids (13, 13, and 3 DEGs in Sg-WDV, Rp-BYDV, and Rp-WDV, respectively) (Fig. 3 and Table S5). In addition, genes associated with the cuticle (structural constituent of the cuticle and structural constituent of the chitin-based cuticle) were also differentially expressed, and there were 16 DEGs in the vector treatment (Sg-BYDV), while there were fewer in the non-vector treatments (with 2 in Sg-WDV, 5 in Rp-BYDV, and 2 in Rp-WDV). In Sg-BYDV, there were 15 DEGs associated with the cuticle, and the majority of them were downregulated (14 downregulated) (Fig. 3 and Table 2). Several genes involved in the cytoskeleton were also differentially expressed, with 6, 1, and 2 related DEGs in Sg-BYDV, Sg-WDV, and Rp-BYDV, respectively (Fig. 3 and Table 3). The results of GO enrichment analysis showed that the number of categories of DEGs enriched in vector aphids was higher than that in non-vector aphids (Fig. S2–S5). The COG enrichment analysis showed a similar pattern in that the number of categories in the vector aphids was greater than that in the non-vector aphids (Fig. S6–S9).
KEGG pathways analysis
To further understand the metabolic pathways that contained differentially expressed genes, KEGG significant enrichment analysis was performed on the differentially expressed genes between virus-treated vector and non-vector aphids. Among all of the DEGs, there were 287, 74, 91 and 27 DEGs that were assigned to KEGG pathways in Sg-BYDV, Sg-WDV, Rp-BYDV and Rp-WDV, respectively (Fig. S10–S13). The DEGs of the richest and shared pathways were associated with metabolism, including translation and fatty acid and carboxylic acid metabolic pathways. Genes involved in energy production pathways may be important for immune defense responses [27]. In the KEGG pathway analyses, we also identified DEGs involved in immune pathways.
Immune-related DEGs of vector and non-vector aphids feeding on virus-infected wheat
Feeding on virus-infected plants resulted in DEGs related to the innate immunity pathway being upregulated. The DEGs related to immunity included several members of the pathway of the lysosomes, the JAK-STAT signaling pathway, antigen processing and presentation, ubiquitin mediated proteolysis, and the peroxisome (Table 4). For the DEG analyses, we identified 12 DEGs involved in immune pathways in the vector aphids (Sg-BYDV), the majority of which were upregulated (11 DEGs). For the non-vector aphids, there were fewer DEGs related to immunity, and only one DEG involved in immunity was identified in each of the Sg-WDV, Rp-BYDV, and Rp-WDV conditions (Table 4).
RT-qPCR
RT-qPCR was used to confirm the expression of 14 selected DEGs of S. graminum and R. padi. These genes were selected because we were interested in their function, since they were involved in immunity, development, growth and reproduction. Based on the transcriptome data, ten unigenes, namely, Sg35675 (Synaphin protein), Sg36590 (Immunoglobulin domain), Sg40066 (Glycosyl hydrolases family), Sg43786 (plexin A3-like), Sg48920 (Immunoglobulin domain), Sg51592 (transcriptional regulator CRZ2-like), Sg53134 (Cadherin domain), Sg56027 (Repeat in HS1/Cortactin), Sg30601 (AAA domain), Sg35800 (PDZ domain) were highly up- or down-regulated in virus-exposed S. graminum. The remaining six unigenes, Rp22910 (apolipophorin-3-like precursor), Rp22931 (cuticular protein 68 precursor), Rp46740 (lysosomal alpha-mannosidase-like), Rp49092 (STAT protein), were highly expressed in virus-exposed R. padi. The results of RT-qPCR showed that all 14 selected genes showed the same expression patterns (upregulated or downregulated) as the RNA-Seq analyses (Fig. 4).
Cloning and characterization of STAT5B in R. padi
For the immunity related genes analysis, STAT5B (signal transducer and activator of transcription 5B) was the only related gene that was upregulated in R. padi fed on WDV infected wheat. STAT5B is a key gene involved in the JAK-STAT pathway, which is important for the innate immune response of insects. Based on the sequence of the transcriptome, we cloned STAT5B from R. padi (designated as RpSTAT5B). The cloned product was validated using RT-PCR and sequencing. The sequencing results of the RpSTAT5B individual isolated colonies confirmed a single transcript identical to that found in the transcriptome sequencing. We have submitted the sequence of STAT5B to the NCBI GenBank database (accession number MK931299). The cloned product of RpSTAT5B is 2364 bp, and it has a complete open reading frame of 2082 bp encoding 693 amino acids, with a molecular weight of 83.2 kDa and a theoretical isoelectric point of 8.31.
Multiple protein sequence alignment of RpSTAT5B with six STAT5B proteins from other insects was conducted. Structural analysis showed that the RpSTAT5B protein has the typical structural features of the STAT5B family (Fig. 5). Phylogenetic analysis indicated that the amino acid sequence of the RpSTAT5B protein showed high identity to that of the corresponding proteins from other aphids of Homoptera, particularly Rhopalosiphum maidis (Fig. 6). Both the RT-qPCR and transcriptome results showed that RpSTAT5B was upregulated in R. padi fed on WDV infected wheat (Fig. 4), suggesting that RpSTAT5B is critical to the response to feeding on infected wheat plants. However, the mechanism of RpSTAT5B in aphid immune defense remains to be further studied.