We searched the proteomes of 50 cyanobacterial species for calpains and identified them in 10 species, based on an HMM of the catalytic CysPC domain typical of calpain proteins. The number of cyanobacterial species found to possess calpains is relatively low, but as shown previously, cyanobacteria are a highly diverse group and their genome content varies significantly even at the species and strain levels (Mohanta et al. 2017). In cyanobacteria, the CysPC domain is often associated with a PPC domain (Table 1, Fig. 2). The PPC domain is typically present at the C-terminus of bacterial secreted proteins (Yeats et al. 2003), whereas in cyanobacterial calpains it is found at the N-terminus. Transmembrane helical regions are absent from all putative cyanobacterial calpains, suggesting a cytosolic localization. These findings are consistent with the study of calpains in other bacteria, which also possess PPC at the N-terminus and lack any predictable transmembrane regions (Rawlings 2015).
Calpains are known to be involved in many cellular processes in multicellular eukaryotes, such as aleurone bilayer development and positional cell division in plants (Olsen et al. 2015), and brain function, memory formation and the development of many pathological processes in mammals (Ono et al. 2016). Calpains cleave a wide range of substrates, including protein kinases, receptor molecules and proteins involved in signal transduction. It has been proposed that calpains play a major role in the regulation of cell signalling rather than in protein digestion (Wang et al. 1989; Moriyasu and Wayne 2004). However, their function in bacteria remains unknown.
The predicted interaction partners of the identified cyanobacterial calpains differ significantly among the studied species. No partner was predicted to interact with calpains in all cyanobacterial species, and only a few were predicted in two, three or four species. Methionine synthase putatively interacts with calpains in four cyanobacterial species, while S8 peptidase and glycoside hydrolase family 3 proteins (such as beta-N-acetylhexosaminidase) putatively interact with calpains in three species. SecA, which is involved in protein translocation across the cytoplasmic and thylakoid membranes, TamB (a component of the translocation and assembly module autotransporter complex) and collagen triple helix repeat protein were identified as putative calpain-interacting partners in only two cyanobacterial species. Other annotated proteins putatively interacting with cyanobacterial calpains were each predicted for a single species, and almost 40% of the predicted interaction partners were non-annotated proteins (Supplementary Fig. S1). Based on these results, it is currently difficult to draw any meaningful conclusion about the function of cyanobacterial calpains. The predicted interaction partners and the function of cyanobacterial calpains can be verified experimentally in the future.
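The comparison of predicted partners across species amounts to counting, for each partner annotation, the number of species in which it was predicted. A minimal sketch of such a tally is given here for illustration only; it is not the pipeline used in this study, and the species labels, partner sets and function names are hypothetical placeholders rather than the actual predictions.

```python
from collections import Counter

# Hypothetical placeholder data (NOT the predictions from this study):
# species label -> set of annotations of predicted calpain interaction partners.
predicted_partners = {
    "species_A": {"methionine synthase", "S8 peptidase", "SecA"},
    "species_B": {"methionine synthase", "S8 peptidase", "TamB"},
    "species_C": {"methionine synthase", "SecA", "hypothetical protein 1"},
    "species_D": {"TamB", "hypothetical protein 2"},
}


def count_species_per_partner(partners_by_species):
    """Count in how many species each partner annotation was predicted."""
    counts = Counter()
    for partners in partners_by_species.values():
        # Each species holds a set, so a partner is counted once per species.
        counts.update(partners)
    return counts


def shared_partners(partners_by_species, min_species=2):
    """Return the partners predicted in at least `min_species` species."""
    counts = count_species_per_partner(partners_by_species)
    return {partner: n for partner, n in counts.items() if n >= min_species}


counts = count_species_per_partner(predicted_partners)
shared = shared_partners(predicted_partners)
# Partners predicted in every species (empty for this toy data set).
universal = [p for p, n in counts.items() if n == len(predicted_partners)]
```

Using sets per species ensures that paralogous hits within one proteome do not inflate the count for a given annotation.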
We also conducted a phylogenetic analysis of the calpain core CysPC domain to infer the phylogenetic position of cyanobacterial calpains. The analysis revealed the monophyly of bacterial as well as of eukaryotic CysPCs, with bootstrap support of 97 and 98, respectively (Fig. 5). No horizontal gene transfers of the CysPC domain from bacteria to eukaryotes or vice versa were detected with our taxon sampling. This is consistent with the results of Rawlings (2015), whose phylogenetic analysis identified only two recent horizontal gene transfers from eukaryotes to bacteria and no recent horizontal gene transfer from bacteria to eukaryotes. The branching order within the domain Bacteria and within the domain Eukarya does not correspond to the actual evolutionary relationships of bacterial and eukaryotic taxonomic groups, respectively. CysPC is thus unlikely to be a suitable marker for inferring evolutionary relationships between organisms, and it is also possible that several horizontal transfers of calpains have occurred within bacteria as well as within eukaryotes.
With the exception of S. hofmannii 2, all cyanobacterial CysPC domains form a monophyletic group within bacterial CysPC domains (Fig. 5). The alignment of cyanobacterial CysPC domains also confirms that CysPC domain 2 from S. hofmannii is the most divergent of the cyanobacterial CysPC domains (Fig. 3). The tree topology also contradicts the hypothesis that cyanobacteria, from which the chloroplasts of Archaeplastida evolved, were the endosymbiotic donors of archaeplastidial calpains.
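The notion of monophyly used here can be stated operationally: in a rooted tree, a set of sequences is monophyletic if some node has exactly that set as its descendant leaves. The following minimal sketch illustrates this test on a toy topology. The nested-tuple tree representation, the leaf labels and the function names are assumptions made for illustration; they do not reproduce the tree in Fig. 5, and the result depends on how the tree is rooted.

```python
def leaves(node):
    """Return the set of leaf labels under a node.

    A leaf is a string; an internal node is a tuple of child nodes.
    """
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, taxa):
    """Return True if some node of the rooted tree has exactly `taxa` as leaves."""
    taxa = set(taxa)
    node_leaves = leaves(tree)
    if node_leaves == taxa:
        return True
    # Stop at leaves, or when this subtree cannot contain the whole set.
    if isinstance(tree, str) or not taxa <= node_leaves:
        return False
    return any(is_monophyletic(child, taxa) for child in tree)


# Hypothetical toy topology (NOT the tree from this study): three sequences
# cluster together, while one divergent sequence groups with other bacteria.
toy_tree = (
    (("cyano_1", "cyano_2"), "cyano_3"),
    (("other_bact_1", "cyano_divergent"), "other_bact_2"),
)

core = {"cyano_1", "cyano_2", "cyano_3"}
core_is_clade = is_monophyletic(toy_tree, core)
all_is_clade = is_monophyletic(toy_tree, core | {"cyano_divergent"})
```

In this toy example the three core sequences form a clade, whereas adding the divergent sequence breaks monophyly, which mirrors the pattern described above in a schematic way.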
The explanation of the origin of eukaryotic calpains depends on the view taken of the origin of eukaryotes themselves. The most popular hypothesis suggests that eukaryotes evolved through the endosymbiosis of an alphaproteobacterial ancestor of mitochondria in an archaeal host (Martin and Müller 1998), probably from the group Asgard archaea (Spang et al. 2019; Liu et al. 2021). Since archaea do not possess calpains, while some alphaproteobacteria do, under this scenario the archaeal host cell could have obtained the calpain gene from an alphaproteobacterial endosymbiont. This scenario would be supported if alphaproteobacterial CysPC domains were placed at the base of eukaryotic CysPCs in the phylogenetic tree with high bootstrap support. Since this is not the case (Fig. 5), our tree does not support an alphaproteobacterial origin of eukaryotic calpains. Nevertheless, the hypothesis that an archaeal ancestor of eukaryotes or the last common ancestor of eukaryotes obtained the calpain gene from an unknown bacterial donor, e.g. via an ancient horizontal gene transfer, cannot be rejected. The scenario that eukaryotic calpains are derived from genes horizontally transferred from a bacterium has also been suggested by Rawlings (2015).
Rawlings (2015) has also proposed that the differential distribution of calpains in bacteria is the result of multiple ancient horizontal gene transfers among bacteria rather than multiple gene losses from various bacteria. In our opinion, the alternative hypothesis that both the bacterial ancestor and the eukaryotic ancestor possessed calpain can still be considered. Currently less popular but still plausible hypotheses for the origin of eukaryotes suggest that Archaea and Eukarya are sister groups. The common ancestor of Archaea and Eukarya might have originated from a bacterium (Cavalier-Smith 2002), or these two domains might have had a common undefined ancestor that was a sister lineage of the domain Bacteria (Woese et al. 1990). An undefined archaeo-eukaryotic ancestor might have been even more complex than all contemporary archaea. In that case, the domain Archaea might have arisen via reductive evolution of this ancestor, and the differences between the genome contents of contemporary archaeal lineages could be explained by differential gene losses (Forterre 2015; Vesteg and Krajčovič 2011; Vesteg et al. 2012). Under this scenario, the calpain gene could already have been present in the last universal common ancestor, lost in the ancestor of Archaea, and retained in the ancestors of Bacteria and Eukarya. Since calpain genes are not universally distributed in either bacteria or eukaryotes, all of the alternative scenarios mentioned would require multiple independent losses of calpain genes in various bacterial and eukaryotic lineages.