Identification of HM gene family in oil palm genome
In this investigation, genome-wide analysis identified a total of 109 EgHM gene family members in the oil palm genome: 48 HMTs (histone methyltransferases), 27 HDMs (histone demethylases), 13 HATs (histone acetyltransferases), and 21 HDACs (histone deacetylases). The EgHM family members were categorized into 11 subfamilies (SDG, PRMT, HDMA, JMJ, HAG, HAM, HAC, HAF, HDA, SRT, and HDT) based on their protein domain architecture. The HMT family comprises 39 SDGs and 9 PRMTs; the HDM family comprises 3 HDMAs and 24 JMJs; the HAT family comprises 4 HAGs, 2 HAMs, 6 HACs, and 1 HAF; and the HDAC family comprises 15 HDAs, 3 SRTs, and 3 HDTs. The gene IDs of all EgHM family members are provided in Supplementary Table 1. The pI and molecular weight of the oil palm HM family members ranged from 4.57 to 9.95 and from 12.3 to 273.3, respectively (Supplementary Table 1). The EgHMs are predominantly localized to the cytoplasm (Supplementary Table 2).
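The family and subfamily counts reported above can be cross-checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary layout and the names `EGHM_COUNTS` and `family_totals` are illustrative assumptions, while the numbers are taken directly from the text (the histone deacetylase subfamily is keyed as HDA, matching the EgHDA gene names).

```python
# Subfamily counts as reported in the text; the data structure itself is illustrative.
EGHM_COUNTS = {
    "HMT": {"SDG": 39, "PRMT": 9},
    "HDM": {"HDMA": 3, "JMJ": 24},
    "HAT": {"HAG": 4, "HAM": 2, "HAC": 6, "HAF": 1},
    "HDAC": {"HDA": 15, "SRT": 3, "HDT": 3},
}


def family_totals(counts):
    """Return the number of genes per family by summing its subfamilies."""
    return {family: sum(sub.values()) for family, sub in counts.items()}


totals = family_totals(EGHM_COUNTS)
grand_total = sum(totals.values())          # all EgHM genes
n_subfamilies = sum(len(sub) for sub in EGHM_COUNTS.values())

print(totals, grand_total, n_subfamilies)
```

The per-family sums reproduce the 48, 27, 13, and 21 members stated for HMTs, HDMs, HATs, and HDACs, and the 11 subfamilies add up to 109 genes.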
Gene structure and conserved motif analysis of EgHM family members
We analyzed the gene structure of all EgHM gene family members using the Gene Structure Display Server. The number of exons varied from 2 to 34 among the HM gene family members (Fig. 1). The highest number of exons (34) was observed in the EgHMT member EgSDG18, and the lowest number (2) in EgSDG39 (Fig. 1). We found 10 conserved motifs among the 109 HM gene family members of oil palm (Fig. 2).
Chromosomal distribution of EgHM members in oil palm genome
The chromosomal distribution of the 109 EgHM gene family members was examined across the 16 chromosomes of the African oil palm genome. The EgHM gene family members were unevenly distributed on the chromosomes (Fig. 3). Of the 109 EgHMs identified, only 86 were mapped to the 16 chromosomes (Fig. 3); 26 EgHM members could not be mapped to any chromosome. Chromosomes 1 and 9 carried the highest number of EgHM family genes (9 each), whereas chromosome 14 carried only one (Fig. 3). Apart from chromosome 14, every chromosome contained at least three EgHM family genes.
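The per-chromosome tally described above can be sketched as follows. The function name `genes_per_chromosome`, the input layout (gene name mapped to a chromosome number, or `None` for an unplaced gene), and the toy records are all hypothetical; they are not the actual EgHM positions.

```python
from collections import Counter


def genes_per_chromosome(assignments):
    """Count genes per chromosome; None marks a gene not placed on any chromosome."""
    placed = Counter(chrom for chrom in assignments.values() if chrom is not None)
    unplaced = sorted(gene for gene, chrom in assignments.items() if chrom is None)
    return placed, unplaced


# Hypothetical toy assignments, NOT the real gene positions.
toy = {"geneA": 1, "geneB": 1, "geneC": 9, "geneD": 14, "geneE": None}
placed, unplaced = genes_per_chromosome(toy)

for chrom in sorted(placed):
    print(f"chromosome {chrom}: {placed[chrom]} gene(s)")
print("unplaced:", unplaced)
```

Keeping unplaced genes in a separate list makes it easy to confirm that mapped and unmapped members together account for the whole family.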
EgHM gene duplication in oil palm genome
To investigate the expansion of the HM gene family in the oil palm genome, we generated a diagram of gene duplication events for duplicated blocks using Circos. In total, 37 pairs of duplicated EgHMs were identified across the 16 chromosomes of oil palm: 14 pairs of EgSDGs, 4 pairs of EgPRMTs, 6 pairs of EgJMJs, 2 pairs of EgHAMs, 1 pair of EgHACs, 9 pairs of EgHDAs, and 1 pair of EgSRTs (Fig. 4). The members of each duplicated pair were located either in the same chromosome block or in different chromosome blocks of the oil palm genome (Fig. 4). Chromosome 1 and chromosome 6 carried 6 and 5 duplicated genes, respectively, whereas chromosome 14 carried no duplicated HM genes (Fig. 4). These results indicate that the EgHM gene family expanded through these duplicated regions.
Phylogenetic analysis between oil palm, rice, and Arabidopsis HM gene family
To elucidate the evolutionary relationships among oil palm, rice, and Arabidopsis, we generated rooted phylogenetic trees for each HM gene family (HAT, HDAC, HDM, and HMT) (Figs. 5, 6, 7, and 8). The subfamilies of each HM gene family showed different clustering patterns. In the HAT tree, the HAGs, HAMs, HACs, and HAFs of oil palm, rice, and Arabidopsis did not all cluster together; some clustered in a species-specific manner and others in a mixed pattern (Fig. 5). In the HDAC family, the HDTs and SRTs clustered in a species-specific manner, whereas the HDAs clustered together (Fig. 6). The phylogenetic tree of the HDMs (HDMAs and JMJs) also showed a distinct trend, with species-specific clustering (Fig. 7). The phylogenetic tree of the HMT family revealed that all the SDG and PRMT genes clustered together (Fig. 8). Altogether, our results indicate a clear evolutionary relationship, as well as diversification, among the HM gene family members of oil palm, rice, and Arabidopsis.
Expression levels of EgHMs in different somatic embryogenic stages (Embryogenic calli, Non-embryogenic calli, and somatic embryos)
Transcriptome data for the oil palm EC, NEC, and SE stages (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA699335) were downloaded from NCBI and used to analyze the expression levels of all 109 identified EgHM gene family members. The transcript abundance of EgHMs at the various stages of oil palm somatic embryogenesis was visualized as a heatmap based on FPKM values. As shown in Fig. 9, the EgHM genes were differentially expressed across the stages of somatic embryogenesis. However, most of the genes were down-regulated in all three stages (Fig. 9). The majority of the HMTs (SDGs and PRMTs) were expressed during the EC and NEC stages (Fig. 9). The HDAC gene EgHDT1 was highly expressed in all three stages of somatic embryogenesis, whereas EgHDA15 was highly expressed at the somatic embryo development stage (Fig. 9). These results highlight the role of specific EgHMs in the conversion of NEC to SE during oil palm somatic embryogenesis.
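FPKM values are usually transformed before being drawn as a heatmap. The sketch below shows one common preprocessing, a log2(FPKM + 1) transform followed by per-gene z-scoring across stages. The text does not state which transformation, if any, was applied for Fig. 9, so this is an assumption, and the gene name and FPKM values are hypothetical.

```python
import math
from statistics import mean, pstdev


def log2_fpkm(fpkm, pseudocount=1.0):
    """log2(FPKM + pseudocount); the pseudocount keeps zero counts finite."""
    return math.log2(fpkm + pseudocount)


def row_zscores(values):
    """Scale one gene's values across stages to mean 0 and unit (population) SD."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)  # flat profile: no variation to display
    return [(v - mu) / sd for v in values]


# Hypothetical FPKM values for one gene across the three stages.
stages = ["NEC", "EC", "SE"]
toy_fpkm = {"geneA": [0.0, 3.0, 15.0]}

logged = {gene: [log2_fpkm(v) for v in row] for gene, row in toy_fpkm.items()}
scaled = {gene: row_zscores(row) for gene, row in logged.items()}

for gene in scaled:
    print(gene, dict(zip(stages, (round(z, 3) for z in scaled[gene]))))
```

Row scaling emphasizes the stage at which each gene peaks, whereas the unscaled log values preserve differences in absolute abundance between genes; which view is appropriate depends on the question being asked of the heatmap.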
Real-time PCR expression analysis of candidate EgHM family genes
A total of 99 EgHM genes were selected, and their relative expression levels at different stages (NEC, EC, and SE) of oil palm somatic embryogenesis were analyzed by qRT-PCR. The selected EgHMs showed varied expression levels across the stages of somatic embryogenesis. The relative expression of HMTs (PRMTs and SDGs) was significantly higher in the EC and SE stages than in the NEC stage (Fig. 10). EgPRMT2 was highly expressed in the EC and SE stages, whereas EgPRMT8 showed its highest expression in the SE stage (Fig. 10). EgPRMT2 and EgPRMT5 showed similar expression in all three stages of somatic embryogenesis. EgSDG24, EgSDG28, and EgSDG35 were highly expressed in the SE stage, and EgSDG18 showed its highest level of expression in the NEC stage. EgSDG19, EgSDG20, and EgSDG37 showed significantly higher expression in the EC stage (Fig. 10). The relative expression of HDMs (HDMAs and JMJs) was significantly higher in the EC and SE stages, whereas EgJMJ20 showed its highest expression in the NEC stage and also in the SE stage (Fig. 10). The relative expression of HATs was also significantly higher in both the EC and SE stages, and some HATs showed their highest expression in a specific stage (either EC or SE) (Fig. 10). The relative expression of the majority of HDACs was also higher in the EC and SE stages than in the NEC stage. EgSRT1 and EgHDA15 were highly expressed in the SE stage (Fig. 10). Taken together, these results indicate a potential role for some EgHMs in somatic embryogenesis of oil palm.
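Relative expression from qRT-PCR is often computed with the 2^-ΔΔCt method. The text does not state which quantification method or reference gene was used here, so the sketch below is an assumption for illustration only; the function name, parameter names, and Ct values are all hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change by 2^-ddCt (assumed method, not confirmed by the text).

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: a target gene in SE (sample) versus NEC (calibrator),
# each normalized to a hypothetical reference gene measured in the same stage.
fold_se_vs_nec = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,          # SE: dCt = 6
    ct_target_calibrator=26.0, ct_ref_calibrator=18.0,  # NEC: dCt = 8
)
print(fold_se_vs_nec)  # ddCt = -2, so the fold change is 4.0
```

A fold change above 1 means higher expression in the sample stage than in the calibrator stage; the method assumes roughly equal amplification efficiency for the target and reference genes.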