Population characterization
To investigate the CVA RNA structure effectively, a population that was neither strongly genetically structured nor closely interrelated was selected. Using single nucleotide polymorphisms (SNPs), principal component analysis (PCA) was performed and a phylogenetic tree was constructed to quantify the population structure of the 75 variants. On the basis of the phylogenetic tree and the PCA, the variants were classified into five clades: types 1-4 and Admix (Figure 1). In the phylogenetic tree, the Admix clade was composed of disordered variants, whereas CVA types 1-4 were distinct. The four CVA types were therefore used for further analysis. Types 1 and 2 lay on two neighboring branches, as did types 3 and 4, indicating that types 1 and 2 had a genetic model similar to that of types 3 and 4.
Nucleotide Diversity In Different Groups
The fine-scale maps of nucleotide diversity (π) for the four types and for all CVAs combined showed great variation along the whole genome (Figure 2). The π values of all CVAs differed strikingly from those of the four individual types, indicating that the CVA classification was reasonable. Some regions had a higher π value in one type (mainly types 2 and 3) than in the other three types. These regions, particularly within the 5′-UTR and 3′-UTR, showed mutations induced in a particular type. Overall, however, the π values of the 5′- and 3′-UTRs were low in all types, and the initial sequence of CDS2 showed the same pattern. These three regions were therefore highly conserved, implying that they have important functions.
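For readers unfamiliar with π, the quantity plotted in such maps can be sketched as the average number of pairwise differences per site, π = Σ(i<j) d_ij / (C(n,2) · L), where d_ij is the number of differing sites between sequences i and j, n is the number of sequences, and L is the window length. The sketch below is only an illustration under stated assumptions: the text does not say which tool or window size was used, the function names `nucleotide_diversity` and `windowed_pi` are hypothetical, gaps and missing data are not handled, and the toy alignment is made up rather than taken from CVA.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must share the same non-zero length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every unordered pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


def windowed_pi(seqs, window, step):
    """Return (window_start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


# Made-up toy alignment for illustration only (not CVA data).
toy = ["AAAAGGGG", "AAAAGGGC", "AAAAGGCC"]
profile = windowed_pi(toy, window=4, step=4)
```

In this toy case the first window is invariant (π = 0) and the second is variable, which is the kind of contrast a fine-scale π map makes visible between conserved and divergent regions.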
Diverse Motifs Of The Four CVA 5′-UTR Types
Gene expression is controlled by the 5′-UTR structure, which is determined by the nucleotide sequence. The MEME program was used to predict motifs in the 5′-UTR sequences of the four types (Figure 3). At almost all motif sites, a single base occurred with a frequency of 100%. There were three motifs in each of types 1, 3, and 4, and two motifs in type 2. The 5′-UTRs of the four CVA types thus comprised different motifs, and these differences among the four types induced different structures.
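The statement that a motif site carries one base at 100% frequency can be made concrete with a per-column frequency count over aligned motif occurrences. The sketch below is not the MEME algorithm and does not call MEME; it only shows how the most frequent base and its frequency would be read off at each position. The function names and the three toy sites are hypothetical and made up for illustration.

```python
from collections import Counter


def column_frequencies(sites):
    """For each aligned position, return (most frequent base, its frequency)."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("motif sites must have equal length")
    result = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        result.append((base, count / len(sites)))
    return result


def invariant_fraction(sites):
    """Fraction of positions at which a single base occurs in every site."""
    freqs = column_frequencies(sites)
    return sum(1 for _, f in freqs if f == 1.0) / len(freqs)


# Made-up aligned motif occurrences (not taken from Figure 3).
toy_sites = ["CUUCC", "CUUCC", "CUACC"]
freqs = column_frequencies(toy_sites)
conserved = invariant_fraction(toy_sites)
```

Here four of the five toy positions are invariant, so `conserved` is 0.8; a motif in which almost every position reaches a frequency of 1.0 corresponds to the pattern described above.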
Secondary Structures In The Four CVA 5′-UTR Types
The RNA secondary structures were predicted from the conserved motif sequences using the Mfold software (Figure 4). The type 1, 2, and 4 structures were formed by two stem-loops, whereas the type 3 structure was formed by only one stem-loop; nevertheless, all four showed evidence of phylogenetically conserved terminal stem-loops. The sequences of the common loop, which probably recruits ribosomes, were dissected, and their compositions were slightly different. On analyzing the characteristics of the 5′-UTR secondary structures in the four CVA types, the CVA 5′-UTR was found to contain a conserved sequence and RNA structure region (CR), CUUUACAGAGCUUCCCAACUGUAAAG, that forms a stem-loop in which the underlined bases are paired. Its loop is particularly rich in pyrimidines. We therefore hypothesized that the polypyrimidine sequences could interact with a region of 18S rRNA to enhance translation activity, as was previously found for Tomato bushy stunt virus, Barley yellow dwarf virus, Turnip crinkle virus, and Triticum mosaic virus [21–23].
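As a quick sanity check on the CR sequence quoted above, the ends of the sequence can be tested for Watson-Crick complementarity, and the pyrimidine content of the enclosed loop can be counted. This is a minimal sketch, not a replacement for Mfold: it performs no energy minimization, considers only A-U and G-C pairs formed between the two ends, and the function names and the `min_loop` parameter are assumptions introduced for illustration.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def terminal_stem(seq, min_loop=3):
    """Pair the two ends inward; return (number of pairs, enclosed loop)."""
    i, j = 0, len(seq) - 1
    paired = 0
    # Stop when the next pair would leave fewer than min_loop loop bases.
    while j - i - 1 >= min_loop and (seq[i], seq[j]) in CANONICAL:
        paired += 1
        i += 1
        j -= 1
    return paired, seq[i:j + 1]


def pyrimidine_fraction(seq):
    """Fraction of C and U residues in an RNA sequence."""
    return sum(base in "CU" for base in seq) / len(seq)


# CR sequence as quoted in the text.
CR = "CUUUACAGAGCUUCCCAACUGUAAAG"
stem_pairs, loop = terminal_stem(CR)
loop_pyrimidines = pyrimidine_fraction(loop)
```

Under these assumptions the first eight bases (CUUUACAG) pair with the last eight (CUGUAAAG), leaving the ten-base loop AGCUUCCCAA, of which six bases are pyrimidines. The loop contains a UUCCC run, which would be consistent with a pyrimidine-rich loop, although the numbering of individual residues within the full 5′-UTR cannot be derived from this fragment alone.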
Translation Enhancement Activity Of CVA 5′-UTR
To evaluate the effects on translation, the four CVA 5′-UTR types were inserted upstream of the FLuc gene (Figure 5A), and in vitro translation of the corresponding transcripts was evaluated using WGE. These four 5′-UTR types enhanced the translation of FLuc 2- to 3-fold in the absence of the 5′ cap (Figure 5B), demonstrating cap-independent translation enhancement activity. In addition, the effect of the conserved RNA structure on translation was analyzed through mutagenesis of the FLuc reporter constructs (Figure 5A). The results show that mutating U35U36C37C38C39 in the loop of CR to AAGGG reduced translation to 47% of that of F-CVA-5U-T1 (Figure 5B). However, the mutation did not reduce translation to a level comparable to that of FLuc, possibly because the CVA 5′-UTR has some polypyrimidine-rich repeats.
Structure Solution Probing Of CVA-5′-UTR-Type 1
Although the secondary structures of the CVA 5′-UTR had been predicted, we set out to evaluate the conserved RNA structure directly. Because the CVA 5′-UTR regions were highly conserved, we selected type 1 to identify the conserved RNA structure by in-line probing assay. In-line probing reports on the spontaneous cleavage of the RNA backbone mediated by 2′-hydroxyl groups that are geometrically in line with oxyanion leaving groups on backbone phosphates. Such in-line geometry occurs primarily in unstructured regions of RNA, where nucleotides are not torsionally constrained by hydrogen-bonded base pairing. Based on the Mfold prediction and the in-line cleavage pattern, CVA-5′-UTR-type 1 has two hairpins (Figure 6B). Most residues in the loop of CR were more susceptible to cleavage, showing that the conserved sequence region contains a stable stem-loop (Figure 6A and B).