In plants, G-type lectins form a large gene family that is believed to play roles in responses to biotic and abiotic stresses [44, 45]. A role in defense has also been reported in strawberry. For instance, 34 G-type LecRK genes were upregulated in F. vesca roots after P. cactorum inoculation [46], and the G-type lectin gene FaMBL1 was found to be involved in F. x ananassa resistance to C. acutatum [25]. A study of the strawberry serine/threonine kinase disease resistance gene family showed that many serine/threonine kinase genes belong to the G-type LecRK class [47], but insight into the genomic organization of G-lectin proteins in strawberry has remained limited. The recent high-quality F. vesca genome annotation provides a good opportunity for a genome-wide study of G-lectin genes in this species.
To identify G-lectin-encoding genes, we used only the GNA domain sequence as a query rather than the full-length sequence. This avoided querying with the kinase domain, which would have introduced considerable ambiguity into G-lectin identification. In total, 133 proteins were assigned to the G-lectin family in F. vesca, and the majority (102 out of 133) contained a kinase domain and belonged to the G-LecRK class. Four genes encoding both a GNA and a kinase domain but lacking a TM domain were classified as G-LecK. The lack of a TM domain may alter the function of these G-lectins.
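The domain-based classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in this study: the function name `classify_g_lectin`, the domain labels, and the example proteins are hypothetical, and the rule that every GNA-containing protein without a kinase domain is a G-LecP is our reading of the class definitions used in this discussion.

```python
from collections import Counter


def classify_g_lectin(domains):
    """Assign a G-lectin class from a collection of predicted domain names.

    Assumed rules (a simplification for illustration):
      GNA + kinase + TM     -> G-LecRK
      GNA + kinase, no TM   -> G-LecK
      GNA, no kinase        -> G-LecP
      no GNA                -> None (not treated as a G-type lectin)
    """
    domains = set(domains)
    if "GNA" not in domains:
        return None
    if "kinase" in domains:
        return "G-LecRK" if "TM" in domains else "G-LecK"
    return "G-LecP"


# Hypothetical proteins, not taken from the F. vesca annotation.
examples = {
    "protA": ["GNA", "SLG", "PAN", "TM", "kinase"],
    "protB": ["GNA", "PAN", "kinase"],
    "protC": ["GNA", "PAN"],
    "protD": ["kinase", "TM"],
}

classes = {name: classify_g_lectin(d) for name, d in examples.items()}
counts = Counter(c for c in classes.values() if c is not None)

print(classes)
print(counts)
```

Because the decision starts from the GNA domain, a protein carrying only kinase and TM domains (protD) is excluded, which mirrors the reason for querying with the GNA domain alone.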
TM domains are required for the plasma membrane localization of G-LecRKs [4, 48]. In rice, a single amino acid substitution (Ile144Met) in the TM domain of Pi-d2, a G-LecRK conferring resistance to M. grisea strain ZB15, made the plant susceptible to this strain, suggesting that the TM domain of Pi-d2 may participate in ligand recognition and signal transduction [4]. Because the substitution did not change the plasma membrane localization of Pi-d2, the altered structure of the mutated TM domain may instead have impaired ligand binding or the transmission of the signal from the extracellular domain to the intracellular kinase catalytic domain [4]. This suggests that the TM domain of G-lectins has a role in both membrane localization and signal transduction. However, most G-LecPs in F. vesca, despite lacking TM domains, were also predicted to be anchored to the plasma membrane. In this regard, the pepper G-LecP CaMBL1, which consists of a GNA domain and a PAN domain and regulates plant defense against the bacterium X. campestris pv. vesicatoria, was reported to be located at the plasma membrane [14]. Moreover, transient expression of CaMBL1 induces the accumulation of salicylic acid and the activation of defense-related genes, which indicates a role in defense signaling even in the absence of TM and kinase domains [14]. These data show that, although most previous studies on G-lectins have focused on G-LecRKs, G-LecPs may also have important functions in plants and merit further study.
The kinase domain of lectin receptor kinases can interact with downstream signaling molecules and exert catalytic activity [48]. Expression analysis of G-LecRKs with mutations in their kinase subdomains revealed that some of these lectins are expressed despite a predicted loss of kinase activity. Further studies should establish the importance of each amino acid residue for kinase domain activity, in order to demonstrate a relationship between conserved motifs and G-LecRK function.
In addition to GNA, TM, and kinase domains, G-lectins may also contain SLG, PAN, and EGF domains. The various domain arrangements of G-lectins create an enormous degree of protein diversity. Proteins with arrangements that include PAN and SLG domains have GO functions related to recognition of pollen, protein phosphorylation, and cell recognition, which makes these proteins important in reproduction and, more generally, in signal perception and/or transduction [49]. Multiple-domain proteins are more species-specific than single-domain proteins, which are commonly shared among many plant species [49]. In F. vesca, more than 90% of G-type lectins were found to be multiple-domain proteins. These species-specific domain arrangements might be a consequence of frequent duplication events followed by lineage-specific retention [50]. This is consistent with our results, in which a large portion of F. vesca G-lectin genes appear to originate from duplication and show various domain arrangements. The various domain arrangements of G-type lectins could be considered a flexible genetic mechanism for producing species-specific adaptation to changing environments [49].
Tandem and dispersed duplication contributed significantly to the expansion of the G-lectin gene family in F. vesca. More than half of the G-type lectin genes of F. vesca originated from duplication events. Chromosome 3, which carries the highest number of G-type lectin genes, also showed a large number of duplication events involving these genes. Conversely, no tandem duplication of G-type lectin genes was found on chromosome 4 or chromosome 7, and these two chromosomes also contain fewer G-type lectin genes than the other chromosomes. Species-specific expansion of the G-type lectin gene family was also reported in a study of lectins in soybean, rice, and Arabidopsis, in which tandem and segmental duplications were regarded as the major mechanisms driving lectin expansion [30]. Consistently, a study of lectin genes in cucumber revealed that 106 out of 146 genes (76.8%) were involved in tandem duplication events [34].
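To make the notion of per-chromosome tandem duplication concrete, the following Python sketch flags homologous gene pairs that lie next to each other on the same chromosome and counts them per chromosome. It is a minimal illustration under assumptions that are not taken from this study: the adjacency criterion (at most `max_gap` intervening genes), the function name `tandem_pairs`, the gene identifiers, the chromosome labels, and the homolog pairs are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def tandem_pairs(genes, homologs, max_gap=1):
    """Return (chromosome, gene_a, gene_b) for homologous neighbouring genes.

    genes:    dict gene_id -> (chromosome, index), where index is the gene's
              ordinal position among all genes on that chromosome.
    homologs: set of frozenset({gene_a, gene_b}) pairs judged homologous.
    max_gap:  assumed maximum number of intervening genes for a tandem pair.
    """
    by_chrom = defaultdict(list)
    for gene_id, (chrom, index) in genes.items():
        by_chrom[chrom].append((index, gene_id))

    pairs = []
    for chrom, members in by_chrom.items():
        members.sort()  # order genes along the chromosome
        for (i, a), (j, b) in combinations(members, 2):
            intervening = j - i - 1
            if intervening <= max_gap and frozenset((a, b)) in homologs:
                pairs.append((chrom, a, b))
    return pairs


# Hypothetical input, for illustration only.
genes = {
    "g1": ("Fvb3", 10),
    "g2": ("Fvb3", 11),
    "g3": ("Fvb3", 13),
    "g4": ("Fvb4", 5),
    "g5": ("Fvb3", 40),
}
homologs = {
    frozenset(("g1", "g2")),  # adjacent on the same chromosome
    frozenset(("g2", "g3")),  # one intervening gene
    frozenset(("g1", "g5")),  # same chromosome but far apart (dispersed)
    frozenset(("g3", "g4")),  # different chromosomes
}

pairs = tandem_pairs(genes, homologs)
per_chrom = Counter(chrom for chrom, _, _ in pairs)
print(pairs)
print(per_chrom)
```

With this toy input, only the two neighbouring pairs on the first chromosome are counted as tandem, while the distant pair and the cross-chromosome pair are left for other duplication categories.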
According to the transcriptome data, many G-lectin genes, whether G-LecRKs or G-LecPs, are actively expressed in different strawberry tissues at different developmental stages. G-lectins in F. vesca actively respond to pathogens, abiotic stress, and elicitors, and some G-lectin genes appear to respond to both biotic and abiotic stress. To date, only one G-lectin gene, FaMBL1 (a homolog of FveGLP6.4), has been studied for its involvement in pathogen resistance in strawberry [25]; however, the underlying molecular mechanism has not yet been elucidated. FveGLP6.4 is not only expressed in several strawberry tissues during development but is also upregulated after challenge with the pathogens B. cinerea, P. aphanis, and P. cactorum (Fig. 7 and Fig. 9), implying that FveGLP6.4 in F. vesca (or FaMBL1 in F. x ananassa) is involved in plant defense.
The molecular features of some G-type lectins from other plant species are better known. Pi-d2 [4], LORE [21], OsSIK2 [26], and CaMBL1 [14], which can regulate plant defense responses, were shown by confocal microscopy to be located at the plasma membrane. For CaMBL1, both its affinity for mannose and the importance of the GNA domain for its localization are known [14]. According to that study, CaMBL1 has affinity toward Manα and/or Manβ and GalNAc residues, and the GNA domain is essential for its binding to D-mannose. A preliminary working model of OslecRK was also proposed by Cheng et al. [24]. In this model, sensing of biotic stress first stimulates OslecRK expression, after which the kinase domain of OslecRK interacts with OsADF (actin-depolymerizing factor) to transduce the signal. The expression of defense-related genes (PR1a, LOX, and CHS) is then induced to strengthen the plant’s immune response.
To further predict the function of G-lectins in F. vesca, we retrieved the genes predicted to be co-expressed with them. G-lectin genes were predicted to be co-expressed with other G-lectin genes, receptor kinase genes, and disease resistance genes, which provides clues to their function. These predictions need to be confirmed experimentally.