Identification of AP2/ERF TFs in Liriodendron chinense
Based on the HMM profile (PF00847) and homology searches, a total of 104 putative AP2/ERF genes were identified in L. chinense and designated LcERF1 to LcERF104. All of these candidates contained one or more AP2/ERF domains according to conserved domain analysis. We then characterized the encoded proteins, including coding sequence (CDS) length, protein length, molecular weight (MW), isoelectric point (pI) and predicted subcellular localization (see Additional file 1: Table S1). The protein lengths of the 104 AP2/ERFs ranged from 100 aa (LcERF29) to 758 aa (LcERF42), with an average of approximately 317 aa (Table 1). The molecular weights varied from 11.48 kDa (LcERF29) to 84.42 kDa (LcERF42), and the isoelectric points ranged from 4.72 (LcERF27) to 10.22 (LcERF67). Subcellular localization prediction placed 83 LcERF proteins in the nucleus and 13 in the chloroplast, with the remaining proteins distributed among the cytoplasm, mitochondria, plasma membrane and other compartments.
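As a rough plausibility check on the reported molecular weights, protein mass can be approximated from length alone. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; this constant and the helper name estimate_mw_kda are illustrative assumptions, not the method used to produce Table S1, which would normally be computed from the actual sequences.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not from the paper).
AVG_RESIDUE_MASS_DA = 110.0

def estimate_mw_kda(n_residues):
    """Approximate protein molecular weight (kDa) from its length in residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0

# (length in aa, reported MW in kDa) for the shortest and longest proteins in the text.
reported = {"LcERF29": (100, 11.48), "LcERF42": (758, 84.42)}

for gene, (length, mw_reported) in reported.items():
    mw_est = estimate_mw_kda(length)
    rel_diff = abs(mw_est - mw_reported) / mw_reported
    print(f"{gene}: estimated {mw_est:.2f} kDa, reported {mw_reported:.2f} kDa, "
          f"relative difference {rel_diff:.1%}")
```

Both estimates fall within a few percent of the reported values, which is what one would expect if the reported lengths and weights are mutually consistent.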
Table 1
List of the 104 AP2/ERF genes identified in Liriodendron chinense
Gene name | Gene ID | Location | Protein length (aa) | Introns | Family group
LcERF1 | Unigene40981_All | Scaffold211 | 261 | 0 | I
LcERF2 | Lchi03057 | Scaffold506 | 328 | 1 | I
LcERF3 | Lchi07965 | Scaffold708 | 325 | 1 | I
LcERF4 | Lchi07966 | Scaffold708 | 561 | 2 | I
LcERF5 | Lchi22931 | Scaffold1519 | 432 | 1 | I
LcERF6 | Lchi09796 | Scaffold2048 | 316 | 1 | I
LcERF7 | Lchi16995 | Scaffold3097 | 316 | 0 | I
LcERF8 | Lchi23250 | Scaffold142 | 152 | 0 | II
LcERF9 | Lchi16170 | Scaffold408 | 193 | 1 | II
LcERF10 | Unigene12650_All | Scaffold416 | 188 | 0 | II
LcERF11 | Lchi16911 | Scaffold480 | 199 | 1 | II
LcERF12 | Unigene40401_All | Scaffold525 | 185 | 0 | II
LcERF13 | Unigene20830_All | Scaffold836 | 186 | 0 | II
LcERF14 | Lchi11957 | Scaffold345 | 224 | 1 | III
LcERF15 | Lchi04946 | Scaffold530 | 235 | 1 | III
LcERF16 | Lchi04947 | Scaffold530 | 295 | 1 | III
LcERF17 | CL2522.Contig2_All | Scaffold530 | 223 | 0 | III
LcERF18 | CL10877.Contig3_All | Scaffold530 | 223 | 0 | III
LcERF19 | Unigene6126_All | Scaffold530 | 211 | 0 | III
LcERF20 | Lchi33109 | Scaffold1203 | 229 | 1 | III
LcERF21 | Lchi33111 | Scaffold1203 | 227 | 1 | III
LcERF22 | Lchi34895 | Scaffold1374 | 513 | 4 | III
LcERF23 | Lchi29925 | Scaffold1675 | 425 | 3 | III
LcERF24 | Lchi08587 | Scaffold39 | 420 | 1 | III
LcERF25 | CL5589.Contig2_All | Scaffold345 | 203 | 0 | III
LcERF26 | Unigene11386_All | Scaffold432 | 211 | 0 | III
LcERF27 | CL5589.Contig1_All | Scaffold530 | 246 | 0 | III
LcERF28 | Lchi00950 | Scaffold723 | 217 | 0 | III
LcERF29 | Lchi01616 | Scaffold1191 | 100 | 1 | III
LcERF30 | Unigene5530_All | Scaffold1191 | 252 | 0 | III
LcERF31 | Lchi32377 | Scaffold1289 | 210 | 0 | III
LcERF32 | Lchi26370 | Scaffold1364 | 193 | 1 | III
LcERF33 | Lchi08922 | Scaffold3419 | 244 | 0 | III
LcERF34 | Lchi28169 | Scaffold654 | 418 | 1 | IV
LcERF35 | Lchi23878 | Scaffold1043 | 141 | 0 | IV
LcERF36 | Lchi22387 | Scaffold1263 | 229 | 1 | IV
LcERF37 | Lchi13652 | Scaffold1315 | 429 | 1 | IV
LcERF38 | Lchi30363 | Scaffold2365 | 475 | 3 | IV
LcERF39 | Lchi30365 | Scaffold2365 | 354 | 1 | IV
LcERF40 | Lchi31374 | Scaffold3032 | 404 | 1 | IV
LcERF41 | Lchi34724 | Scaffold3708 | 355 | 1 | IV
LcERF42 | Lchi10868 | Scaffold159 | 758 | 2 | V
LcERF43 | Lchi11945 | Scaffold345 | 421 | 7 | V
LcERF44 | Lchi25937 | Scaffold1371 | 183 | 1 | V
LcERF45 | Lchi16637 | Scaffold2432 | 226 | 1 | V
LcERF46 | Lchi34468 | Scaffold2926 | 206 | 1 | V
LcERF47 | Lchi05084 | Scaffold3476 | 136 | 1 | V
LcERF48 | Lchi07311 | Scaffold172 | 268 | 0 | VI
LcERF49 | Lchi22103 | Scaffold920 | 369 | 2 | VI
LcERF50 | Lchi17039 | Scaffold3097 | 359 | 1 | VI
LcERF51 | Lchi02638 | Scaffold416 | 310 | 1 | VII
LcERF52 | Lchi02639 | Scaffold416 | 362 | 1 | VII
LcERF53 | Lchi11452 | Scaffold525 | 383 | 1 | VII
LcERF54 | Lchi04620 | Scaffold775 | 290 | 1 | VII
LcERF55 | Lchi04621 | Scaffold775 | 289 | 1 | VII
LcERF56 | Lchi04623 | Scaffold775 | 236 | 1 | VII
LcERF57 | Lchi07083 | Scaffold135 | 273 | 1 | VIII
LcERF58 | Lchi07084 | Scaffold135 | 203 | 0 | VIII
LcERF59 | Lchi13371 | Scaffold1075 | 207 | 1 | VIII
LcERF60 | Lchi11824 | Scaffold1130 | 375 | 3 | VIII
LcERF61 | Lchi13392 | Scaffold1763 | 207 | 1 | VIII
LcERF62 | Unigene7795_All | Scaffold1763 | 205 | 0 | VIII
LcERF63 | Lchi31572 | Scaffold1784 | 180 | 1 | VIII
LcERF64 | Lchi08484 | Scaffold39 | 204 | 1 | IX
LcERF65 | Lchi09908 | Scaffold79 | 316 | 1 | IX
LcERF66 | Unigene24905_All | Scaffold79 | 170 | 0 | IX
LcERF67 | Lchi01406 | Scaffold432 | 211 | 1 | IX
LcERF68 | Unigene35921_All | Scaffold432 | 301 | 0 | IX
LcERF69 | Lchi08172 | Scaffold580 | 324 | 0 | IX
LcERF70 | Lchi07909 | Scaffold708 | 174 | 1 | IX
LcERF71 | Lchi31530 | Scaffold803 | 105 | 2 | IX
LcERF72 | CL9762.Contig1_All | Scaffold1024 | 307 | 0 | IX
LcERF73 | Lchi05992 | Scaffold1024 | 250 | 1 | IX
LcERF74 | Lchi05993 | Scaffold1024 | 350 | 2 | IX
LcERF75 | Lchi26525 | Scaffold1934 | 272 | 1 | IX
LcERF76 | Lchi26532 | Scaffold1934 | 361 | 3 | IX
LcERF77 | Lchi28702 | Scaffold54 | 309 | 2 | X
LcERF78 | Unigene10666_All | Scaffold100 | 235 | 1 | X
LcERF79 | Lchi02215 | Scaffold682 | 229 | 1 | X
LcERF80 | Lchi02216 | Scaffold682 | 196 | 1 | X
LcERF81 | Lchi18461 | Scaffold943 | 403 | 1 | X
LcERF82 | Lchi11856 | Scaffold1130 | 128 | 1 | X
LcERF83 | Lchi01932 | Scaffold1191 | 292 | 1 | X
LcERF84 | Lchi20453 | Scaffold1167 | 328 | 2 | VI-L
LcERF85 | Lchi14855 | Scaffold41 | 394 | 6 | AP2
LcERF86 | Lchi23120 | Scaffold192 | 405 | 7 | AP2
LcERF87 | Lchi16948 | Scaffold480 | 550 | 9 | AP2
LcERF88 | Lchi08043 | Scaffold502 | 680 | 6 | AP2
LcERF89 | Lchi11241 | Scaffold503 | 375 | 8 | AP2
LcERF90 | Lchi28881 | Scaffold509 | 474 | 7 | AP2
LcERF91 | Lchi06162 | Scaffold527 | 524 | 7 | AP2
LcERF92 | Lchi03252 | Scaffold764 | 535 | 6 | AP2
LcERF93 | CL7987.Contig2_All | Scaffold805 | 572 | 12 | AP2
LcERF94 | Unigene5404_All | Scaffold2118 | 490 | 6 | AP2
LcERF95 | Lchi33401 | Scaffold2225 | 563 | 6 | AP2
LcERF96 | Lchi13837 | Scaffold2467 | 662 | 7 | AP2
LcERF97 | CL6967.Contig2_All | Scaffold2956 | 327 | 6 | AP2
LcERF98 | Unigene39546_All | Scaffold3476 | 468 | 7 | AP2
LcERF99 | Lchi08779 | Scaffold67 | 376 | 1 | RAV
LcERF100 | Lchi02516 | Scaffold100 | 607 | 4 | RAV
LcERF101 | Lchi02519 | Scaffold100 | 428 | 2 | RAV
LcERF102 | Lchi15640 | Scaffold1242 | 354 | 2 | RAV
LcERF103 | Lchi23744 | Scaffold1330 | 361 | 1 | RAV
LcERF104 | Lchi32356 | Scaffold3563 | 235 | 5 | Soloist
Phylogenetic analysis and classification of LcERF genes
On the basis of conserved domain analysis and multiple alignments of LcAP2/ERF protein sequences, and consistent with the classification in Arabidopsis, the 104 LcERF proteins were categorized into four subfamilies: ERF, AP2, RAV, and Soloist. All 84 ERF genes contained a single AP2/ERF domain, and based on the characteristics of the amino acid sequences and domains that they encode, these genes were further divided into the DREB and ERF subfamilies, with 41 and 43 members, respectively. Among the remaining genes, 14 were identified as members of the AP2 subfamily owing to their tandemly repeated double AP2/ERF domain. In addition, 5 genes that possessed a single AP2/ERF domain together with a B3 domain were classified in the RAV subfamily. The last one, LcERF104, is homologous with the Arabidopsis Soloist gene (At4g13040) and was classified in the Soloist subfamily. Following the description in Nakano’s study [19], the DREB subfamily comprises four groups, named I, II, III, and IV, which contain 7, 6, 20, and 8 members, respectively. The ERF subfamily genes can be divided into seven groups based on phylogenetic analysis, namely V, VI, VII, VIII, IX, X, and VI-L, with 6, 3, 6, 7, 13, 7, and 1 members, respectively. The sequence alignment of LcERF genes showed that the WLG element was highly conserved in the ERF, DREB and RAV subfamilies but less conserved in the AP2 subfamily, whereas the RAYD, AA, and other elements were conserved in the AP2 subfamily (Fig. 1).
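The group sizes quoted above can be checked arithmetically against the subfamily totals. The following minimal sketch simply re-adds the member counts stated in the text; the variable names are illustrative.

```python
# Member counts per group, as stated in the text (and consistent with Table 1).
group_sizes = {
    "I": 7, "II": 6, "III": 20, "IV": 8,          # DREB groups
    "V": 6, "VI": 3, "VII": 6, "VIII": 7,         # ERF groups
    "IX": 13, "X": 7, "VI-L": 1,
}
dreb_groups = ("I", "II", "III", "IV")

# Sum the DREB groups and treat every other listed group as ERF.
dreb_total = sum(group_sizes[g] for g in dreb_groups)
erf_total = sum(n for g, n in group_sizes.items() if g not in dreb_groups)

# Subfamilies that are not subdivided into groups.
other_subfamilies = {"AP2": 14, "RAV": 5, "Soloist": 1}
family_total = dreb_total + erf_total + sum(other_subfamilies.values())

print(f"DREB: {dreb_total}, ERF: {erf_total}, all AP2/ERF genes: {family_total}")
```

The sums reproduce the 41 DREB and 43 ERF members and the overall total of 104 genes.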
The evolutionary relationships of all the candidate genes were further illustrated by phylogenetic analysis. In the unrooted tree, the AP2, RAV and Soloist genes each clustered into a separate branch. The ERF genes were divided into 2 large branches, the ERF branch and the DREB branch, which were further divided into 7 and 4 groups, respectively (Fig. 2). These findings coincided with the grouping of the ERF subfamily described in part 3.1 based on the conserved motifs (Fig. 1). In addition, this result showed the same clustering pattern as that obtained by the classification method based on alignment with Arabidopsis (Table 1). As a result, we propose that these 104 putative genes are indeed AP2/ERF family genes in L. chinense.
Gene structure and conserved motif analysis of LcERF genes
To further understand the structural composition of LcERF genes, we analyzed the genomic DNA sequences using the online Gene Structure Display Server, with the locations of exons and introns provided by the Liriodendron genomic resource. The number of introns varied among the distinct subfamilies (Fig. 3A). Except for a few members carrying more than one intron, most of the DREB and ERF subfamily genes have only one intron or none. In the AP2 subfamily, all the genes possess numerous introns, with intron numbers ranging from 6 to 12; LcERF93 has the most introns (12), whereas most AP2 genes contain 6 or 7. Four of the five RAV members possessed one or two introns, and the single Soloist member contained five introns. The position of introns also differed among subfamilies. In genes with a single intron, the intron was mostly located near the N-terminal or C-terminal end and rarely in the middle of the sequence, because these genes usually consist of one long exon and one very short exon. In general, members with close evolutionary relationships and those from the same subfamily had similar exon and intron structures in terms of intron number, intron position and exon length.
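The intron statistics for the AP2 subfamily can be reproduced directly from the intron column of Table 1. The sketch below copies the values for LcERF85 to LcERF98 from the table and summarizes them; it is an illustration of the tally, not part of the original analysis pipeline.

```python
from collections import Counter

# Intron counts of the 14 AP2-subfamily genes, copied from Table 1.
ap2_introns = {
    "LcERF85": 6, "LcERF86": 7, "LcERF87": 9, "LcERF88": 6, "LcERF89": 8,
    "LcERF90": 7, "LcERF91": 7, "LcERF92": 6, "LcERF93": 12, "LcERF94": 6,
    "LcERF95": 6, "LcERF96": 7, "LcERF97": 6, "LcERF98": 7,
}

# How many genes carry each intron count.
counts = Counter(ap2_introns.values())
# Gene with the largest number of introns.
most_introns_gene = max(ap2_introns, key=ap2_introns.get)
# Number of genes with exactly 6 or 7 introns.
six_or_seven = counts[6] + counts[7]

print(f"range: {min(ap2_introns.values())} to {max(ap2_introns.values())} introns")
print(f"gene with most introns: {most_introns_gene}")
print(f"genes with 6 or 7 introns: {six_or_seven} of {len(ap2_introns)}")
```

The tally gives a range of 6 to 12 introns, with LcERF93 at the top and 11 of the 14 genes carrying 6 or 7 introns.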
Conserved motifs of the 104 LcERF proteins were identified using the MEME (Multiple Em for Motif Elicitation) tool. A total of 15 conserved motifs were found, ranging in length from 15 to 50 amino acids. As AP2 DNA-binding motifs, motif 1 and motif 2 occurred together in both the DREB and ERF subfamilies, except that motif 1 also existed independently in the RAV and Soloist subfamilies. Although most of the ERF subfamily members shared motif 1 and motif 2, the other motifs varied among proteins (Fig. 3B). Proteins in the DREB subfamily contained relatively more conserved motifs than those in the ERF subfamily, especially in group III and group IV. Motifs 5, 7, 8 and 15 were detected in some group III proteins, and motifs 9, 10, 11 and 12 were found in most group IV members. In the AP2 subfamily, all 14 proteins carried motif 3 and motif 6, and some of them also had motif 4 and motif 12. In addition, motifs 13 and 14 existed only in the B3 domain and were therefore considered specific to the RAV subfamily.
Expression profiles of LcERF genes in different tissues
We investigated the expression profiles of LcAP2/ERF genes in various tissues using Illumina RNA-Seq data [31] and constructed a heatmap. A total of 86 LcERF genes were detected in the various tissues, including 34 genes in the DREB subfamily, 35 genes in the ERF subfamily, 12 genes in the AP2 subfamily, 4 genes in the RAV subfamily, and one Soloist gene. To explore the differential expression of these genes among tissues, the FPKM values were standardized by row with TBtools software, and the standardized results were then clustered by row and column (Figure 4A). Several genes were expressed in all tissues and clustered in a large group. The clustering separated the remaining genes according to their expression patterns, including pistil-specific, stamen-specific, leaf-specific, shoot-specific and other patterns.
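Row standardization of an expression matrix can be sketched as follows. This is a minimal illustration that assumes a z-score transform (subtract the row mean, divide by the row standard deviation); the exact scaling applied by TBtools is not stated in the text, and the gene names and FPKM values below are hypothetical.

```python
from statistics import mean, pstdev

def standardize_rows(fpkm):
    """Return row-wise z-scores; rows with zero variance become all zeros."""
    scaled = {}
    for gene, values in fpkm.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation of the row
        scaled[gene] = [0.0 if sigma == 0 else (v - mu) / sigma for v in values]
    return scaled

# Hypothetical FPKM values for two genes across four tissues.
tissues = ["leaf", "shoot", "stamen", "pistil"]
fpkm = {
    "geneA": [1.0, 40.0, 2.0, 1.0],   # strongly shoot-biased
    "geneB": [5.0, 5.0, 5.0, 5.0],    # flat profile
}

z = standardize_rows(fpkm)
# Tissue in which geneA has its highest standardized expression.
peak_tissue = tissues[z["geneA"].index(max(z["geneA"]))]
print(peak_tissue)
```

After scaling, each row has mean zero, so genes with very different absolute expression levels can be compared by the shape of their tissue profile rather than by magnitude.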
Expression patterns of LcERF genes and discovery of shoot-specific genes
To reveal genes involved in shoot and leaf development, we focused on genes that were expressed specifically in shoot tissue. All the LcAP2/ERF family genes were divided into ten clusters using the K-means method in the STEM program. Cluster IV and cluster V showed tissue-specific expression in leaves and shoots, respectively. Cluster V contained eight genes, while cluster IV contained only one (Figure 5A). In addition, enrichment analysis with an adjusted p-value was used to assign the expressed genes to the different LcAP2/ERF groups. Interestingly, six of the eight genes showed significant enrichment in cluster V, while the single member of cluster IV failed to pass the significance test (Figure 5B). Among these six genes, three belong to the LcERF VIII group (LcERF57, LcERF58 and LcERF63), and the other three belong to the LcAP2 subfamily (LcERF94, LcERF96 and LcERF98).
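The idea behind K-means clustering of expression profiles can be illustrated with a small self-contained sketch. This is plain Lloyd's algorithm with a deterministic start, not the STEM implementation, and the gene names and profile values are hypothetical.

```python
from math import dist

def kmeans(profiles, k, iterations=50):
    """Plain Lloyd's k-means on expression profiles (dict of name -> list of floats)."""
    names = list(profiles)
    # Deterministic start: the first k profiles serve as initial centroids.
    centroids = [list(profiles[n]) for n in names[:k]]
    labels = {}
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centroid.
        new_labels = {
            n: min(range(k), key=lambda c: dist(profiles[n], centroids[c]))
            for n in names
        }
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its member profiles.
        for c in range(k):
            members = [profiles[n] for n in names if labels[n] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical standardized expression in [leaf, shoot, stamen].
profiles = {
    "geneA": [0.1, 2.0, 0.0],
    "geneB": [1.9, 0.2, 0.1],
    "geneC": [0.0, 1.8, 0.2],
    "geneD": [2.1, 0.0, 0.0],
}
labels = kmeans(profiles, k=2)
print(labels)
```

In this toy example the two shoot-biased profiles end up in one cluster and the two leaf-biased profiles in the other, which is the kind of grouping used to pick out tissue-specific clusters.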
We then annotated the functions of these six genes by submitting their sequences to The Arabidopsis Information Resource (TAIR) database. Through alignment and annotation, all three genes from the LcAP2 subfamily were mapped to AINTEGUMENTA (ANT) or AINTEGUMENTA-like (AIL) genes, which are considered to be involved in the maintenance of the shoot apical meristem (GO:0010492), the auxin-mediated signaling pathway involved in phyllotactic patterning (GO:0060774), plant organ morphogenesis (GO:1905392), cell division (GO:0051301) and cell growth (GO:0016049). In contrast, LcERF57, LcERF58 and LcERF63 from the LcERF VIII group showed broader functions, such as negative regulation of the ethylene-activated signaling pathway (GO:0010105, GO:0009873) and glucosinolate metabolic process (GO:0019760).
Potential involvement of shoot-specific genes in shoot and leaf development
The expression of the six candidate genes showing shoot-specific patterns was further verified using RT-qPCR. We determined the expression of these genes in seven tissues: leaf, shoot, sepal, petal, stamen, pistil and stem (Figure 4B). Consistent with the RNA-seq results, the expression of LcERF94, LcERF96 and LcERF98, which are from the AP2 subfamily, was relatively shoot specific. LcERF58 and LcERF63, which are from Group VIII in the ERF subfamily, were primarily expressed in shoots as well as flowers. However, LcERF57 could not be amplified from cDNA or gDNA after repeated optimization of primer design and amplification conditions; as a result, we did not examine its function in the following assays (details are not discussed in this article). Considering the potential functions annotated in the NCBI GenBank and Gene Ontology (GO) databases, we infer that these three Group VIII genes may have dual roles. This conjecture is supported by previous studies, which have shown that ERF VIII subgroup genes play an important role in in vitro shoot regeneration and development [32].
In addition to these expression patterns, we further examined the molecular functions by separating shoots into multiple layers of tender leaves and then detecting the expression of the candidate genes at different leaf development stages (Figure 6A). RT-qPCR was performed to detect the expression of target genes from P1 to P6 as well as in the SAM (Figure 6B). The results revealed different expression patterns among genes from different subfamilies. Specifically, expression of LcERF58 rose gradually from P1 to P6, whereas the exact opposite was observed for LcERF94, LcERF96 and LcERF98 from the LcAP2 subfamily as well as LcERF63 from the ERF group VIII. The opposite expression patterns in leaf primordia (P1 ~ P2) and tender leaves (P3 ~ P6) suggest that the functions of these genes differ considerably in regulating shoot and leaf development. Compared with LcERF58/63, LcERF94/96/98 were specifically expressed in shoots, where their expression was almost 100 times that in other tissues. These results also indicate that LcERF94/96/98 may play an essential and unique role in the development and morphogenesis of both shoots and leaves.
Subcellular localization of LcERF genes
To investigate the potential function of AP2 genes in transcriptional regulation, we examined the subcellular localization of LcERF94/96/98 using young tobacco leaves. Confocal microscopy was used to observe and photograph the transiently transformed lower epidermal cells of tobacco leaves, and bright-field, GFP fluorescence, chlorophyll fluorescence and merged images were obtained (Fig. 7). The control, 35S::GFP, showed GFP fluorescence throughout the whole cell. The GFP fluorescence of pBI121-35S::GFP-LcERF94/96/98 was observed only in the nucleus, which is consistent with the characteristics of TFs, and these observations support the proposed role of LcERF94/96/98 as nucleus-localized TFs.