Morphology of SG and cocoons
To investigate the arrangement of the SG in N. cornuta, we performed micro-CT imaging. Our 3D study revealed a lateral looping pattern in the SG. In the larval model of N. cornuta, the SG extended approximately to the middle of the body, terminating at the posterior level of the third abdominal segment (Fig. 1). The anterior part of the SG, corresponding to what is typically known as the anterior silk gland (ASG) in other moths and caddisflies, extended from the cephalic region to the end of the first thoracic segment (Fig. 1A). Posterior to this part, the SG thickened distinctly, and the thickened portion continued up to the first abdominal segment. This portion corresponds to the middle silk gland (MSG) in other lepidopterans. A thin layer of cells on the SG surface surrounded the lumen with stored silk (Fig. 1D). In the first abdominal segment, the SG expanded further and formed two loops: a shorter loop towards the head and a longer loop towards the lateral part of the body (Fig. 1). This region, which may correspond to the posterior silk gland (PSG) observed in other lepidopterans, was marked by large secretory cells encircling the spacious lumen (Fig. 1E).
The silk stored in the SG lumen showed notable color differences when stained using Masson’s trichrome (Fig. 1B–E). The silk material in the ASG appeared red, whereas that in the MSG and PSG appeared dark blue. These color differences suggest that the silk material undergoes structural changes as it moves through the SG. Notably, this staining method did not show the expected differentiation between the fibroin core and the coating layers within the lumen (Fig. 1).
The cocoon of N. cornuta (Fig. 2) was oval-shaped and approximately 4 mm long. It featured a thin, compact wall that formed an almost impermeable envelope. The cocoon filaments, approximately 3 µm wide, were ribbon-like and tended to fuse with the adhesive layer, suggesting a high content of sticky material. Notably, the front part of the cocoon included chimney-shaped pores with diameters of 40–50 µm (Fig. 2A–C). These perforations were likely formed by deliberate circular movements of the larval head during spinning and may allow the pupa to interact with the external environment.
Detection of putative silk structural proteins
Transcriptome analysis was conducted to identify the major silk components of N. cornuta. First, we isolated RNA and prepared an RNA-seq cDNA library. Transcriptomes of entire larvae were constructed using both de novo and genome-guided assembly methods. The de novo transcriptome served as a proxy protein database. Cocoon proteins were digested with trypsin, and the resulting peptides were identified using peptide mass fingerprinting by comparing the experimental MS/MS spectra with those predicted from the database. We identified 113 proteins, 62 of which contained a signal peptide (Table S1). As shown in Table S1, these 62 proteins were categorized into four groups. The first group (13 members) comprised potentially large structural silk proteins characterized by repetitive sequences. The second group comprised 24 enzymes of various types (Table S1). The third group (nine members) comprised proteins with homologs in other species but no established association with silk. The fourth group comprised 16 small proteins with unknown functions, some of which also exhibited repetitive sequences and likely contributed to the structural integrity of silk.
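The database-matching step relies on predicting which peptides trypsin would release from each candidate protein. The following Python sketch illustrates only that prediction, using the common rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name, the toy sequence, and the missed-cleavage parameter are illustrative assumptions; the search software and settings actually used are not specified here.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return predicted tryptic peptides for a protein sequence.

    Assumed rule (a common simplification): cleave C-terminal to K or R,
    except when the following residue is P.
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Remaining C-terminal fragment without a cleavage site at its end
        fragments.append(seq[start:])

    # Join neighbouring fragments to model up to `missed_cleavages` missed sites
    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(fragments):
                peptides.append("".join(fragments[i:i + extra + 1]))
    return peptides


# Toy sequence (hypothetical, not taken from the N. cornuta data)
print(tryptic_digest("AKRPGKS"))
```

In a real workflow, the predicted peptides for every entry in the transcriptome-derived database would be compared with the measured spectra; here the sketch stops at the digestion step.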
The first group, comprising the structural proteins, was particularly notable regarding the composition of the N. cornuta cocoon. It comprised products of genes that are organized into small clusters and encode putatively large structural silk components (Table 1). One of these clusters, located in the JAHKQU010000031.1 contig of the publicly available genome ASM2038319v1, included genes encoding sericin-like proteins (Srcl1, Srcl2, and Srcl3). These highly hydrophilic proteins had repetitive sequences with a high proportion (21–57%) of serine residues. Srcl2 and Srcl3 were particularly large, with molecular weights of 933 and 1450 kDa, respectively. Additionally, four other genes in this group resembled caddisfly cadhesins 9. These genes were arranged in pairs on two genomic contigs (JAHKQU010000021.1 and JAHKQU010000101.1), and the coding region of each was split into two exons. Their protein products lacked sequences capable of forming crystalline domains and did not exhibit sequence conservation with fibroins or sericins. The structural group also contained four zonadhesin-like genes on the JAHKQU010000031.1 contig that encode protein products between 29 and 130 kDa. These products had a high proportion of cysteine residues (14–15%) and conserved Til/EGF2 domains, suggesting a possible role in protease inhibition. Finally, the structural group included two large genes (containing four and ten exons) encoding putative mucin-like proteins, Muc1 and Muc2, located as singletons on contigs JAHKQU010000032.1 and JAHKQU010000010.1, respectively. These proteins had hydrophilic, repetitive sequences in which serine residues comprised 10% and 13% of the sequence, respectively.
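Residue proportions such as the serine and cysteine contents quoted above are simple counts over the protein sequence. A minimal Python sketch of that calculation is given below; the function name and the toy sequence are hypothetical and are not taken from the N. cornuta data.

```python
from collections import Counter


def residue_fraction(protein, residue):
    """Fraction of a protein sequence made up of one amino acid (one-letter code)."""
    # Drop a trailing stop-codon symbol if the sequence came from a translation
    seq = protein.upper().replace("*", "")
    if not seq:
        raise ValueError("empty sequence")
    return Counter(seq)[residue.upper()] / len(seq)


# Toy serine-rich repeat (hypothetical): 3 of 5 residues are serine
toy = "SSGSA"
print(f"{100 * residue_fraction(toy, 'S'):.0f}% serine")
```

Applied to a full-length sequence, the same call with 'C' instead of 'S' would give the cysteine content.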
N. cornuta silk lacks a close homolog of FibH
Silk proteomic analysis revealed that proteins similar to FibH were absent from N. cornuta silk. We therefore performed a comprehensive survey of the transcriptomic and genomic sequences of N. cornuta using the BLAST algorithm, with conserved regions of known fibH genes from Lepidoptera and Trichoptera as queries. Our analysis also included the genome sequence of another micropterigid species, Micropterix aruncella. However, we failed to identify fibH-like sequences in either species. These results suggest that fibH has either been entirely lost from the genomes of both micropterigid species or has diverged so substantially that it is unrecognizable by sequence similarity.
We investigated the conservation of the genomic region harboring the fibH gene by analyzing the genome sequences of several lepidopteran and trichopteran species. The fibH gene is typically located between two genes, prospero (pros) and dihydrolipoyllysine-residue succinyltransferase (DRSC). This arrangement is conserved among many lepidopteran and trichopteran species. In Lepidoptera, it occurs in Nematopogon swammerdamelus (Adelidae) and Incurvaria masculella (Incurvariidae) in Adeloidea, as well as in Nymphalis io (Nymphalidae) and B. mori (Bombycidae). It also occurs in two caddisfly suborders, in Hydropsyche tenuis (Hydropsychidae) and Parapsyche elsis (Hydropsychidae) in Annulipalpia, and in Himalopsyche kuldschensis (Rhyacophilidae) in Rhyacophiloidea. In addition, our study revealed the relatively conserved presence of homologs of two additional genes, ubiquitously expressed transcript (UXT) and repetitive organellar protein (ROP), in the same genomic region, as shown in Fig. 3.
There were some minor rearrangements within the genomic region, such as the opposite orientation of the UXT gene between caddisflies and moths, an inversion of DRSC in the ditrysian Lepidoptera B. mori and N. io, and the translocation of ROP in H. tenuis and P. elsis (Fig. 3). Despite these rearrangements, the entire region was notably conserved in most of the Lepidoptera and Trichoptera species studied. Our results also suggest that more extensive restructuring occurred in the two micropterigid genomes: UXT and ROP were relocated to different genomic regions, and the original DRSC copy was lost and replaced by an intronless copy elsewhere in the genome (contig JAHKQU010000032.1 in N. cornuta; chromosome 15 in M. aruncella). Although the 3′-terminal part of the DRSC gene remained at its original position in N. cornuta, the region between pros and this DRSC remnant did not contain any sequence that could encode a large repetitive protein such as FibH. Overall, fibH was absent from the syntenic regions of N. cornuta and M. aruncella, whereas it was present in other species of both orders, supporting the hypothesis that the micropterigids lost their fibH gene secondarily.
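The synteny argument amounts to asking which annotated genes lie between two flanking markers on the same contig. The Python sketch below illustrates that check under simplifying assumptions: each gene is reduced to a contig name and a start coordinate, and all names and coordinates in the example are invented for illustration and do not come from any of the genomes discussed.

```python
def genes_between(gene_positions, left, right):
    """List genes located between two flanking genes on the same contig.

    gene_positions maps gene name -> (contig, start coordinate).
    Returns None when the flanking genes are on different contigs,
    otherwise the intervening gene names ordered by coordinate.
    """
    contig_l, start_l = gene_positions[left]
    contig_r, start_r = gene_positions[right]
    if contig_l != contig_r:
        return None
    lo, hi = sorted((start_l, start_r))
    inside = [
        name
        for name, (contig, start) in gene_positions.items()
        if contig == contig_l and lo < start < hi
    ]
    return sorted(inside, key=lambda name: gene_positions[name][1])


# Hypothetical annotations (coordinates invented for illustration)
with_fibh = {"pros": ("ctgA", 1000), "fibH": ("ctgA", 9000), "DRSC": ("ctgA", 40000)}
without_fibh = {"pros": ("ctgB", 1000), "DRSC": ("ctgB", 6000), "UXT": ("ctgC", 500)}

print(genes_between(with_fibh, "pros", "DRSC"))
print(genes_between(without_fibh, "pros", "DRSC"))
```

An empty list for the second case corresponds to a flanked interval with no annotated gene; it does not by itself rule out an unannotated or highly diverged sequence, which is why the interval itself must also be inspected.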
Possible N. cornuta ortholog of fibL
Neither FibH-like nor FibL-like proteins were detected among the silk proteins of N. cornuta. To investigate this further, we used the BLAST algorithm to thoroughly examine the transcriptome and genome sequences of N. cornuta, as well as the genome sequence of M. aruncella. In N. cornuta, we identified a distantly related sequence resembling fibL, which we named fibX; the gene is organized into six exons. When comparing the protein encoded by fibX with FibL proteins from Lepidoptera and Trichoptera, we observed that the similarity was notably low (Fig. S2). The FibX protein shares only 25–28% identity with FibL proteins from other species. Notably, we did not find any fibL-like gene in the genome of M. aruncella.
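For readers unfamiliar with how a percent-identity figure is obtained, the Python sketch below computes it from two already aligned sequences. It assumes one particular convention, counting identical residues over aligned columns in which neither sequence has a gap; other tools use different denominators, and the convention behind the values reported above is not stated here. The sequences are toy examples.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Assumed convention: columns where either sequence has a gap ('-')
    are ignored; identity = matches / compared columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 4 gap-free columns, 3 identical
print(percent_identity("MKT-LV", "MRTAL-"))
```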
To assess the conservation of synteny in the genomic regions harboring fibL in Lepidoptera and Trichoptera, we analyzed the corresponding sequences in several representatives of both taxa. We examined the genomes of four moths and one caddisfly (B. mori, I. masculella, M. aruncella, N. cornuta, and H. kuldschensis). Our analysis revealed that in the species studied, the fibL gene is on the same contig or chromosome as several conserved genes, including dynein regulatory complex subunit 4 (DRC4) and histone deacetylase 11 (HDAC11). However, the order of these genes is not conserved, which prevented us from identifying the exact expected location of fibL in micropterigids. Notably, the fibX gene in N. cornuta is located on the same contig as DRC4 and HDAC11, suggesting that fibX could be a highly diverged ortholog of fibL. Additional data from micropterigids are needed to confirm this orthology.
Loss of FibH and the loss or divergence of FibL are specific to Micropterigidae
Is the absence of FibH specific to the family Micropterigidae, or does it also apply to other Lepidoptera? To address this question, we chose two representative species from the family Eriocraniidae in the suborder Eriocranoidea, which is known for its basal split from other Lepidoptera 18. We generated transcriptomes of the final-instar larvae of both eriocraniid species and performed BLAST searches to identify sequences homologous to the fibH and fibL genes. We identified sequences similar to fibH and fibL in both species. The alignments of the putative FibH and FibL proteins with several homologs from Lepidoptera and Trichoptera are shown in Figs. 4 and S1, respectively. Both proteins contain conserved sequences that are characteristic of fibroins. Our results suggest that the absence of fibH and the substantial divergence or loss of fibL are specific to Micropterigidae.