High-throughput sequencing and the rapid development of bioinformatic analysis have greatly facilitated modern venomic research. In this work, we investigated the venom components of S. verrucosa by integrative analysis of genomic, transcriptomic and proteomic data. The chromosome-level genome provides a fundamental genetic resource for characterizing venom components and for in-depth biomedical study. We first assembled a high-quality S. verrucosa genome, which enabled us to preliminarily annotate a total of 476 toxin genes. Further transcriptomic analysis of the venom gland characterized 12 DETGs as the potential major expressed toxin genes in the venom gland, and 11 of them were eventually supported by the venom proteome.
Unique venom composition of S. verrucosa underlies its evolutionary adaptation
A total of 467 toxin genes were annotated in the S. verrucosa genome, with the peptidase S1 family, the venom metalloproteinase family and the SNTX/VTX toxin family being the most abundant (Fig. 1C). Peptidase S1 family genes encode serine proteases, which coordinate various physiological functions, including digestion, immune response, blood coagulation and reproduction. Snake venom serine protease (SVSP) has been characterized as one of the four dominant protein families in snake venom, together with phospholipase A2 (PLA2), snake venom metalloprotease (SVMP), and three-finger toxins (3FTxs)[56]. In the S. verrucosa genome, peptidase S1 family genes (102) were the most abundant toxin genes (Fig. 1C). Interestingly, only two of them showed expression levels with TPM > 100, and peptidase S1 genes as a whole accounted for only a small proportion of the VG transcripts. This is similar to the cases in scorpions, spiders and centipedes[57–59]. Correspondingly, peptidase S1 proteins were found at only around 0.13% in the crude venom (Supplementary Data 3). A total of 36 venom metalloproteinase family genes were annotated in our S. verrucosa genome; all of them showed low expression (TPM < 50), and no metalloproteinase protein was identified in the crude venom. Similarly, 6 PLA2 genes (TPM < 1) were barely detected in the VG transcriptome. Instead, neoVTX proteins from the 6 tandem-repeated genes were the most abundant toxin proteins in the venom (Fig. 3B). Though we characterized 12 DETGs in the VG transcripts, only 3 toxin proteins showed an abundance of more than 1% in the crude venom: a pair of neoVTX subunits, a CRVP-like protein, and a Kunitz-type serine protease inhibitor. These results indicate that the venom is less complex than that of other venomous animals, which was also observed in a similar study of S. horrida[25].
Last but not least, we found 4 unannotated genes with high expression (TPM > 500) in the VG but no corresponding proteins in the venom (Fig. 2B), indicating their potential roles as unique genes involved in VG function in S. verrucosa. Together, the above results suggest that this unique venom composition underlies the evolution of the venom phenotype of S. verrucosa.
Integrative analysis revealed contribution of neoVTX genes to venom diversity of S. verrucosa
The major lethal molecules of S. verrucosa venom were purified and molecularly characterized as early as the 1990s. VTX and neoVTX were purified by two different research groups and characterized as different proteins, and their relationship has remained unclarified so far. VTX was first characterized as a tetrameric protein composed of VTXa and VTXb subunits[30, 31], whereas neoVTX was a dimeric protein composed of neoVTXa and neoVTXb subunits[60]. They were considered different proteins mainly on the basis of differences in their deduced amino acid sequences: neoVTXb showed 90% identity with the reported VTXb, lower than its identity with SNTXb from another species, S. horrida. Further analysis found that VTXb carried an additional nucleotide at the 3’ end of the CDS, leading to a frameshift, which accounted for the major difference between them. It was suggested that geographical variation of the fish used in the two studies contributed to the major loci differences between the two toxin genes[60]. In our assembled genome, the discriminative loci in the 3 tandem-repeated beta subunit genes were all consistent with neoVTXb (GenBank: AB262393.1) (Supplementary Fig. 6). Interestingly, native PAGE electrophoresis did reveal two protein bands with molecular weights close to those of the reported dimeric and tetrameric proteins, and both were further confirmed to contain neoVTX proteins by mass spectrometry sequencing (Fig. 3D). These results indicate that neoVTX proteins were the major lethal toxins in the fish strain we used and that they may be present as dimers, tetramers, or even other unknown polymeric forms.
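The effect of a single additional nucleotide near the 3’ end of a CDS can be illustrated with a minimal sketch. The sequence below is a hypothetical toy CDS, not the VTXb or neoVTXb sequence, and the insertion position is chosen only for illustration; the codon table is the standard genetic code.

```python
# Standard genetic code (NCBI translation table 1) built from its compact form.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a CDS codon by codon, stopping at the first stop codon.

    Trailing bases that do not form a complete codon are ignored.
    """
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy CDS (assumption for illustration only).
reference = "ATGGCTAAAGGTGAACTGTAA"
# Same sequence with one extra nucleotide inserted near the 3' end.
shifted = reference[:12] + "C" + reference[12:]

ref_protein = translate(reference)
shifted_protein = translate(shifted)

print(ref_protein)      # residues encoded by the reference frame
print(shifted_protein)  # residues after the insertion differ and the stop codon is lost
```

In this toy case the residues upstream of the insertion are unchanged, while every residue downstream differs and the original stop codon is no longer read in frame, which is how a single-base difference can produce a markedly different C-terminal sequence.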
Gene duplication has been shown to play pivotal roles in the evolution and diversification of toxin genes[61]. The major toxins in snake venom, including metalloproteinases, phospholipases A2, and three-finger toxins, all experienced gene duplication events during evolution[15, 62]. A total of 32 SNTX/VTX family genes were annotated in our assembled S. verrucosa genome. Integrative analysis with the venom gland transcriptome and venom proteome enabled us to show that most of the neoVTX proteins in the venom came from the six tandem-repeated neoVTX subunit genes, predominantly a pair of a2 and b3 genes (Fig. 3C). Another two SNTX-like genes were also located adjacent to this region. Both showed the 4 conserved domains but with much lower sequence identity, and they clustered in a separate branch of the phylogenetic tree (Fig. 1D). Additionally, the transcriptomic data showed that these two SNTX-like genes were barely expressed in the venom gland, which was further supported by the absence of the corresponding proteins in the venom proteome. These results indicate that they may share the same origin as the neoVTX genes but have no functional role in S. verrucosa venom.
Alternative splicing (AS) of venom genes has been considered a complementary mechanism for generating venom complexity across animal lineages[19, 63, 64]. We found a total of 856 isoforms expressed in the venom gland from 198 toxin genes, and 411 of them, including 29 trans-splicing isoforms, were mapped to the 6 tandem-repeated neoVTX subunit genes, greatly increasing the potential to produce diverse neoVTX proteins. Expression analysis of these AS isoforms revealed that the pair of neoVTX a2 and b3 genes transcribed several of the most highly expressed isoforms (Fig. 4E). Interestingly, though we identified a total of 29 trans-splicing isoforms among the six tandem-duplicated neoVTX genes, all of them showed much lower expression levels (Fig. 4D and E), leaving their functional roles and biological significance unknown. Within the crude venom, 94.13% of the SNTX/VTX proteins were mapped to the pair of a2 and b3 genes, whereas transcriptomic data showed that their expression accounted for only 77.50% of that of all 6 tandem-repeated genes (Fig. 3C), indicating that an unknown regulatory mechanism in the venom gland preferentially promotes translation of neoVTX a2 and b3. Unfortunately, our proteomic sequencing approach could not distinguish these proteins of various sizes and charges. A 2D gel electrophoresis of the crude venom from the closely related species S. horrida revealed 15 SNTXa isomers and 26 SNTXb isomers with varied ORF lengths, which partially supports the alternative splicing of neoVTX genes found in this study. These results indicate that AS of toxin genes is a conserved mechanism underlying venom diversity in stonefish[25]. Further work is required to investigate the biological significance of these neoVTX isomers.
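The gap between the protein-level and transcript-level shares of the a2/b3 pair can be made concrete with a short calculation. This is only an illustrative sketch using the two percentages reported above; it assumes, as a simplification, that the proteomic share (of SNTX/VTX proteins) and the transcriptomic share (of the 6 tandem-repeated genes) are directly comparable, and the variable and function names are hypothetical.

```python
def odds(share_percent: float) -> float:
    """Convert a percentage share into odds (share versus remainder)."""
    p = share_percent / 100.0
    return p / (1.0 - p)


# Shares of the neoVTX a2/b3 pair reported in the text (Fig. 3C).
protein_share = 94.13      # % of SNTX/VTX proteins in the crude venom
transcript_share = 77.50   # % of expression of the 6 tandem-repeated genes

# Simple fold difference between the two shares.
fold = protein_share / transcript_share

# Odds ratio: how strongly a2/b3 is favoured over the other genes
# at the protein level relative to the transcript level.
odds_ratio = odds(protein_share) / odds(transcript_share)

print(f"fold difference: {fold:.2f}")
print(f"odds ratio: {odds_ratio:.2f}")
```

Both values above 1 are consistent with a2 and b3 being over-represented at the protein level relative to their transcript abundance; the calculation does not by itself distinguish preferential translation from differences in protein stability or detection.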
Other major toxin proteins characterized in the venom of S. verrucosa
The most abundant toxin protein other than the neoVTX proteins was characterized as CRVP in this study, whereas its homologue in S. horrida was instead defined as Golgi-associated plant pathogenesis related protein (ShGAPR). Though ShGAPR showed higher sequence similarity with human GAPR1, it was suggested to act as a venom toxin similar to CRVPs[25]. Transcriptomic analysis of the S. verrucosa venom gland revealed that sv_crvp1 was the most highly expressed toxin gene in the VG and was categorized as a DEG compared with SK (Fig. 2B and C), further supporting its potential role as a toxin gene in S. verrucosa. The closest homologue of SvCRVP annotated in ToxProt was helothermine, which was first isolated from the venom of the Mexican beaded lizard (Heloderma horridum horridum)[65]. Lizard helothermine is a peptide toxin that can block ryanodine receptors, which are responsible for Ca2+ release within skeletal, cardiac and neuronal cells[66, 67]. Further work is encouraged to elucidate the potential role of SvCRVP as the most abundant venom component other than the neoVTX proteins in S. verrucosa.
Protease inhibitors in venom are widely used by snakes as weapons to disrupt the homeostasis of the prey’s physiological and biochemical reactions, such as blood coagulation and blood pressure regulation, resulting in immobilization or death of the prey[68]. In addition, protease inhibitors have been hypothesized to protect peptide/protein toxins in the venom from degradation by proteases from prey or predators[69]. Kunitz-type serine protease inhibitors, C-type lectins and phospholipase A2 were reported as the most extensively distributed protease inhibitors in snake venom[70]. We annotated 17 Kunitz-type family genes in the S. verrucosa genome, and only one of them (sv_kspi1/SvKSPI) was characterized as a DETG in the venom gland (Fig. 2B and 3B), with an abundance of 1.84% of total toxin proteins. Homologues of Snaclec B1 and SLXA, which belong to the C-type lectins, were also found in the crude venom (Fig. 3B) and, together with SvKSPI, served as the major protease inhibitors in S. verrucosa.
Calglandulin, previously identified from the venom gland of Bothrops insularis, was characterized as a putative Ca2+ binding protein with four EF-hand motifs[71]. Using antiserum against recombinant calglandulin, it was found that calglandulin was detectable only in the venom gland of Bothrops insularis, but not in the crude venom or other tissues, indicating that calglandulin may be primarily involved in the cellular control of toxin secretion from the gland into the venom[71, 72]. In contrast, calglandulin was identified as a major component in the crude venom of the fire ant Solenopsis invicta[73] and stingray[74], indicating unknown functions beyond toxin secretion. In the S. verrucosa venom, a small proportion of calglandulin proteins (0.60%) was also detected; these could not be assigned specifically to either sv_caglp1 or sv_caglp2 because the two genes share the same deduced amino acid sequence (Supplementary Fig. 7). Notably, transcriptomic analysis revealed that only the sv_caglp1 gene was characterized as a DEG between the venom gland and the skin, suggesting that an underlying transcriptional regulation mechanism drives sv_caglp1 as the major gene functioning in toxin secretion (Fig. 2B and C).
Hyaluronidase is one of the enzymes commonly seen in the venom of snakes, scorpions, spiders and leeches[75]. Hyaluronidase activity is considered critical for spreading toxins and disrupting the integrity of the target’s extracellular matrix through the degradation of hyaluronic acid, and it has been considered a therapeutic target against snake venom[76, 77]. The cDNA of S. verrucosa hyaluronidase was previously cloned and characterized, with a conserved active site composed of one catalytic residue and four substrate-positioning residues[32]. In the present study, we characterized the hyaluronidase gene as one of the DETGs in the venom gland, which was further supported by the proteomic data (Fig. 2B and 3B).
Among the 12 DETGs defined in the S. verrucosa venom gland, only the sv_tpt1 gene had no proteomic evidence in the crude venom (Fig. 3B). The sv_tpt1 gene encodes a Translationally Controlled Tumor Protein (TCTP), which has predominantly been described as a venom toxin in spiders, contributing to the allergic and inflammatory activity of the venom[78, 79]. Few studies have paid attention to TCTP toxins in other venomous animals, leaving their biological function underinvestigated. Expression of the TCTP gene was reported in the venom glands of three scorpionfishes (Scorpaenidae spp.), indicating its wide distribution in venomous teleosts[80]. In S. verrucosa, the sv_tpt1 gene showed comparably high expression in both the VG and SK (Fig. 2C), which, together with its absence from the crude venom, indicates a potential role beyond that of a toxin in S. verrucosa.
A group of toxin proteins beyond the 12 DETGs was also identified on the basis of abundance in the crude venom (Fig. 3B). A putative endothelial lipase, first found in the venom gland transcriptome of the eastern diamondback rattlesnake, was suggested to have phospholipase and triglyceride lipase activities[81]. Lactotransferrin is a major iron-binding protein usually found in exocrine fluids, including mucosal secretions and mammalian breast milk. It was annotated as a toxin because of its presence in the proteome of the venom accessory glands of the vampire bat (Desmodus rotundus), where it acted as an antimicrobial[82]. Similarly, peroxiredoxin-4[83], Snaclec B1[84] and Cobra venom factor-like protein[85] have also been reported as venom proteins in other venomous animals. The presence of these proteins in the crude venom of S. verrucosa further supports their characterization as toxin proteins, whereas the underlying molecular mechanisms need further investigation.
A notable non-toxin protein in the venom
A set of non-toxin proteins is shown in Fig. 3B. Though some of them have never been reported in crude venom, they were hypothesized to play housekeeping or homeostasis-maintaining roles in the VG. Among them, a gastrotropin-like protein was the second most enriched after the neoVTX proteins; it is encoded by the FABP6 gene, which also showed high expression in the VG (Fig. 2B and 3B). Gastrotropin/FABP6 is a fatty acid binding protein and a therapeutic target related to immune infiltration[86]. Gastrotropin was first characterized as a venom protein in S. horrida, but not in any other venomous animal. It was suggested to contribute to the unique activities of that venom by maintaining venom homeostasis and protecting venom components from degradation[25]. We did find that the extracted crude venom of S. verrucosa remained highly active after hours at room temperature. Further work is encouraged to investigate the biological significance of the gastrotropin-like protein in S. verrucosa venom.