GRAS proteins, a group of plant-specific transcription regulators, are named after the acronyms of three initially identified members: GAI, RGA and SCR. Typically, GRAS proteins are composed of 400–770 amino acid residues (Bolle, 2004; Pysh et al., 1999), and contain several highly-conserved motifs at their C-termini but great variation in length and sequence at their N-termini. The consecutive conserved motifs at C-terminal region include LHR I, VHIID, LHR II, PFYRE and SAW (Pysh et al., 1999; Sun et al., 2011), which contribute to protein function. The structure of VHIID with its flanking two leucine heptad repeats (LHR I and LHR II) is critical for protein–protein interaction. The mutagenesis of PFYRE and SAW motifs displayed distinct phenotype abnormality in Arabidopsis thaliana, indicating that they may contribute to the structural integrity of GRAS family (Wang et al., 2014; Itoh et al., 2002; Silverstone, Ciampaglio & Sun, 1998). Except for two conserved N-terminal motifs (DELLA and TVHYNP) characterized only for the members of DELLA subgroup, N-termini of GRAS proteins display large divergence, which may determine functional specificity of such regulatory proteins (Sun et al., 2011).
Recently, GRAS genes have been characterized in a number of plant species, such as A. thaliana, rice (Oryza sativa), tomato (Solanum lycopersicum), poplar (Populus trichocarpa), Chinese cabbage (Brassica rapa ssp. pekinensis), maize (Zea mays), Medicago truncatula and pine (Pinus radiata) (Abarca et al., 2014; Huang et al., 2015; Lu et al., 2015; Ma et al., 2010; Song et al., 2014; Tian et al., 2004; Zhang et al., 2017). According to the conserved motifs and sequence similarity, GRAS family members in two model plants, Arabidopsis and rice, were classified into eight distinct subfamilies, namely DELLA, HAM, LISCL, PAT1, LAS, SCR, SHR and SCL3 (Tian et al., 2004). However, the number of subfamily was ranged from eight to 16 in other plants such as Prunus mume, tomato and maize, suggesting that species-specific subfamily may exist in those plants unexamined yet.
The known studies demonstrated that GRAS proteins function in various physiological processes during plant growth and development, including axillary meristem formation, root development, gametogenesis, phytochrome and gibberellin acid (GA) signal transduction and the response to stresses (Cenci & Rouard, 2017). Considering the fact that amino acid sequences in each subfamily are highly homologous, each group might possess distinct functions. For example, SCR and SHR, are both found to regulate root and shoot radial organization via a SCR/SHR complex in Arabidopsis (Cui et al., 2007; Helariutta et al., 2000). DELLA members usually act as the inhibitors of GA signaling perception (Sun & Gubler, 2004). SCL3 mainly expressed in endodermis is essential for integrating downstream pathways of SCR/SHR and GA/DELLA, and controlling GA homeostasis during root development (Zhang et al., 2011). AtSCL13 from the PAT1 subfamily participate in phytochrome-B (phyB) signal transduction (Bolle, Koncz & Chua, 2000), whereas other members (PAT1, SCL5 and SCL21) from the same subfamily mainly function as positive regulators mediating phyA signaling pathway to control plant development (Torres-Galea et al., 2006). OsMOC1, a putative GRAS protein, has been proven as a positive regulator of rice tillering, important in the direct control of grain yield (Li et al., 2003). Although these GRAS members were functionally characterized in model plants, large amount of GRAS proteins remain to be elucidated for their functions, particularly in agricultural plants.
Pepper (Capsicum annuum L.) is an economically important vegetable. It has tremendous value for providing food, spice, coloring agent, pharmaceuticals and ornamental products (Kim et al., 2014; Qin et al., 2014). In 2013, the total pepper production in the world already reached 34.9 million tons, making it the second largest Solanaceae crop after tomato (Kim et al., 2014). The accomplishment of whole genome sequencing in 2014 provides a platform for us to conduct genome-wide analysis for an entire gene family and to explore the right gene which is critical for pepper growth and development (Kim et al., 2014; Qin et al., 2014). By far, transcription factor families, such as WRKY, Dof, SBP-Box and Hsp70 have been characterized in pepper (Guo et al., 2015b, 2016; Wu et al., 2016). However, pepper GRAS proteins and their functional specificity have not been characterized in detail. Here, we firstly describe the entire members of GRAS family in pepper using comparative genomic tools and experimental verification. A total of 50 CaGRAS genes were identified from pepper genome. The intron/exon organization and protein structure of each GRAS member were also characterized, together with their phylogenetic relationships and chromosomal locations. Subsequently, we examined the function diversity of CaGRAS members by conserved motif analysis, followed by real-time PCR to profile their expression patterns in different tissues and various stress treatments. The present study provides essential knowledge to further illuminate molecular functions of GRAS genes in regulation of pepper growth and development as well as environmental responses.
Materials and Methods
Identification and annotation of pepper GRAS genes
Whole genome data for pepper cv. CM334 and cv. Zunla-1 were used for this study, and their genomic information were downloaded from http://peppergenome.snu.ac.kr/download.php and http://peppersequence.genomics.cn/, respectively (Kim et al., 2014; Qin et al., 2014). Arabidopsis GRAS genes were obtained from TAIR (https://www.Arabidopsis.org/), whereas rice GRAS genes were downloaded from RGAP (http://rice.plantbiology.msu.edu) (Tian et al., 2004). The tomato GRAS information was obtained from SGN (https://solgenomics.net/) (Niu et al., 2017). The latest Hidden Markov Model (HMM) of GRAS domain (PF03514.11) (http://pfam.sanger.ac.uk/) was used as a BLAST query to search against the entire protein datasets of cv. CM334 and cv. Zunla-1 with an E-value of 1e−5 using HMMER 3.0 (Huang et al., 2015). Meanwhile, all AtGRAS and OsGRAS proteins were used as queries to search against the two pepper databases using default parameters. The length of the hit out of the range from 350 to 820 aa was rejected. In order to validate their putative accuracy, conserved domains essential for GRAS proteins were evaluated by SMART (http://smart.embl-heidelberg.de/) and PFAM database. Finally, all outputs from two independent databases were aligned and those having similar GRAS core domain were deemed as the same gene. After these stringent criterions, sequences with the presence of GRAS domain were retained for further analysis. In our study, we refer to the variety cv. CM334 as the reference for subsequent whole genome-wide analysis.
Phylogenetic analysis of CaGRAS genes
All screened GRAS proteins from pepper, Arabidopsis, rice and tomato were used for multiple alignments by ClustalW program (Larkin et al., 2007). Gene IDs of GRAS members used in this study were listed in Table S1. Arabidopsis and rice are most common used model plants for researching genetic correlations, and tomato is another model plant of the Solanaceae family, which is closely related to pepper. Then maximum likelihood method was adopted to generate an unrooted phylogenetic tree based on alignment results. Reliability of phylogenetic tree was estimated with 1,000 bootstrapping replicates (Tamura et al., 2013). GRAS members in pepper were further categorized into different subfamilies based on well-established classifications in Arabidopsis (Tian et al., 2004).
Protein property and gene structure analysis
With the help of multiple expectation maximization for motif elicitation (MEME, http://meme-suite.org/), conserved motifs of GRAS proteins were scanned with the following parameters: (1) maximum number of motif was 12; (2) optimum motif width was set from 6 to 50 aa (Bailey et al., 2009). These identified motifs were further validated using InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/) (Mulder & Apweiler, 2007). The properties of GRAS proteins were calculated on ExPASy online server (http://web.expasy.org/), such as molecular weight (MW), isoelectric point (pI), instability index and GRAVY value (grand average of hydropathy) (Gasteiger et al., 2003). Based on the relationship of coding sequence and its corresponding genomic DNA sequence, the final exon/intron distribution of each CaGRAS gene was illustrated by GSDS 2.0 (gene structure display server, http://gsds.cbi.pku.edu.cn/) (Hu et al., 2015).
Chromosomal mapping and gene duplication analysis
Physical position of each CaGRAS gene was extracted from pepper genome annotation file, and plotted onto the corresponding chromosome using Mapchart 2.3 (Voorrips, 2002). We renamed each CaGRAS gene according to its ascending chromosomal distribution. Duplicated gene pairs and patterns in pepper were analyzed by using MCScanX and BLASTP (Wang et al., 2012). Tandem duplicated genes were characterized as contiguous homologous genes located in a 100 kb single region or separated by less than five genes, while the whole blocks of genes copying from one chromosome region to another were defined as segmental duplications (Tang et al., 2008). Subsequently, non-synonymous (Ka) and synonymous substitution (Ks) between duplicated CaGRAS gene pairs were calculated by PAL2NAL (http://www.bork.embl.de/pal2nal/) (Suyama, Torrents & Bork, 2006). The microsyntenic map of precise region containing GRAS genes among pepper, tomato and Arabidopsis was created by MCScanX and plotted using Circos (Krzywinski et al., 2009; Wang et al., 2012).
Prediction of CaGRAS protein–protein interaction network
To further clarify the relationships between CaGRASs, a protein–protein interaction network was predicted using their interologs from Arabidopsis. First, specific interolog relationships between Arabidopsis AtGRASs and pepper CaGRASs were mapped from INPARANOID database (http://inparanoid.sbc.su.se/cgi-bin/gene_search.cgi) (Remm, Storm & Sonnhammer, 2001). Then, we retrieved the interaction information among AtGRASs from AraNet database (http://www.functionalnet.org/aranet/) and mapped these attributions to CaGRASs to generate corresponding interaction relationships for pepper (Guo et al., 2015b; Lee et al., 2010). Finally, these interaction networks among CaGRASs were visualized using Cytoscape version 3.4.0 (Shannon et al., 2003).
Expression analysis of CaGRAS genes in different tissues
The public transcriptome data of leaf, stem, root, pericarp and placenta at mature green, breaker, five and 10 days post-breaker, six, 16 and 25 days post-anthesis (PC-MG, PL-MG, PC-B, PL-B, PC-B5, PC-B10, PL-B5, PL-B10, PC-6DPA, PC-16DPA, PC-25DPA, PL-6DPA, PL-16DPA, PL-25DPA) for pepper cv. CM334 have been previously generated (Guo et al., 2015a; Kim et al., 2014). We retrieved the fragments per kilobase per million reads value representing the expression level of each CaGRAS gene and displayed the result using BAR Heatmapper Plus.
Pepper plant preparation and stress treatments
Pepper plants were grown on soil in greenhouse with conditions: 14/10 h photoperiod, 25/20 °C day/night temperature and 60% relative humidity. In this study, pepper seedlings with 6–8 true leaves were randomly divided into five groups, namely control (untreated) and treatment with cold (4±1 °C), salt (300 mM NaCl), drought (400 mM mannitol) and gibberellin solution (20 μM GA). Leaves were sampled at 3 h after the treatment. For each treatment, leaves from five randomly selected seedlings were bulked to form one sample, and six biological replicate samples were immediately frozen in liquid nitrogen and then stored at −80 °C before use.
RNA isolation and qRT-PCR analysis
Total RNA from leaves was extracted using Total RNA kit (BioTeke, Beijing, China) and reversely transcribed into cDNA using M-MLV Reverse Transcriptase (Promega, Madison, WI, USA). Real-time quantitative PCR (qRT-PCR) experiment was done using SYBR GREEN I Master Mix (Applied Biosystems, Waltham, MA, USA) on iCycler iQ™ thermocycle (Bio-Rad, Hercules, CA, USA). Each reaction volume contained 12.5 μl of SYBR GREEN Mix, 1 μl of each primer, 5 μl of 10 × diluted cDNA, and 5.5 μl of nuclease-free water. The reaction program was set as follows: initial polymerase incubation at 95 °C for 10 min, then 40 cycles of 95 °C for 15 s, 60 °C for 45 s. Melting curve analysis was conducted with heating the PCR product from 60 °C to 95 °C for verifying the specificity of the primers. The relative expression levels of CaGRAS genes were calculated based on the comparative Ct method using the 2−ΔΔCt method with the actin1 as an internal reference gene. Primer pairs were designed by Primer Premier 5.0 and checked by NCBI Primer BLAST (Table S3).
Genome-wide identification of GRAS gene family in pepper
We employed two different approaches to identify GRAS genes in pepper genome. Totally, 50 non-redundant CaGRAS genes were found from variety cv. CM334, concurrent with the corresponding genes from cv. Zunla-1 (Table 1). Nearly all these proteins contained one representative GRAS domain (PF03514.11), with the exception of three CaGRASs (CA00g84110, CA01g26680 and CA00g84090) that had more than one such domain. The molecular mass and length of CaGRAS proteins varied greatly, with MWs ranging from 48 to 87 KDa and length from 419 to 801 aa. The average theoretical pI was 6.1, implying that most CaGRAS proteins were weakly acidic. Only CaGRAS21 was stable because of its instability index less than 40, whereas the rest were considered as unstable. All CaGRASs were predicted to be hydrophilic due to the less GRAVY value (<0) of each protein. Most of CaGRAS proteins contained large percentage of aliphatic amino acids, with predicted aliphatic index ranging from 65.74 to 95.76. Interestingly, most of CaGRAS genes (84%) were intronless, while seven members had just one intron. Only one CaGRAS gene had two introns (Fig. 1).
|ID||Name||Chr||Position (Mb)||Group||Length (aa)||MW (KDa)||pI||Aliphatic index||Instability index||GRVY||Corresponding ID in Zunla-1|
Chromosomal localization and gene duplication analysis of CaGRAS genes
Except for six members (CaGRAS45-50) mapped to the scaffolds, the remaining 44 CaGRAS genes were unevenly distributed across 11 out of 12 pepper chromosomes. Among those anchored members, Chr7 occupied the largest number of GRAS genes (n = 7; 15.22%), followed by Chr1 (n = 6; 13.04%) and three chromosomes (Chr2, Chr5 and Chr12) each having five members. Additionally, four GRAS genes were located on Chr4, while three genes were detected on Chr3, Chr4 and Chr9, respectively. Two GRAS genes were found on Chr8, and only one was on Chr10. Notably, most of CaGRAS genes were gathered at both ends of chromosomes.
Furthermore, we analyzed the duplication events of CaGRAS gene in pepper genome since gene duplication acts importantly on the occurrence of novel functions and gene family expansion. As shown in Fig. 2, two pairs of tandem duplicated genes (CaGRAS4/5 and CaGRAS20/21) located on Chr1 and Chr5, respectively. Additionally, 10 pairs of CaGRAS genes were identified as segmental duplications (Fig. 3). We found that all duplicated gene pairs had Ka/Ks ratios less than 0.5, suggesting these genes experienced strong purifying selection pressure during evolution processes (Table 2). Clearly, segmental duplication played a more prominent role in the expansion of pepper GRAS genes than tandem duplication.
|Gene pairs||Ka||Ks||Ka/Ks||Duplication Type||Selection Type|
|CaGRAS4 vs. CaGRAS5||0.5433||2.9882||0.1818||Tandem||Purifying|
|CaGRAS20 vs. CaGRAS21||0.3469||3.0958||0.1121||Tandem||Purifying|
|CaGRAS13 vs. CaGRAS24||0.2277||1.2418||0.1834||Segmental||Purifying|
|CaGRAS18 vs. CaGRAS22||0.1727||0.6141||0.2813||Segmental||Purifying|
|CaGRAS18 vs. CaGRAS32||0.4234||2.6272||0.1611||Segmental||Purifying|
|CaGRAS18 vs. CaGRAS40||0.1706||0.6532||0.2612||Segmental||Purifying|
|CaGRAS22 vs. CaGRAS40||0.1332||0.6816||0.1955||Segmental||Purifying|
|CaGRAS22 vs. CaGRAS32||0.3849||3.4016||0.1132||Segmental||Purifying|
|CaGRAS23 vs. CaGRAS43||0.6308||11.5851||0.0545||Segmental||Purifying|
|CaGRAS25 vs. CaGRAS35||0.4549||3.1434||0.1447||Segmental||Purifying|
|CaGRAS28 vs. CaGRAS32||0.2969||3.0958||0.0959||Segmental||Purifying|
|CaGRAS32 vs. CaGRAS40||0.4094||3.8387||0.1066||Segmental||Purifying|
Ka indicates nonsynonymous substitution rate, and Ks indicates synonymous substitution rate.
In order to understand the phylogenesis of GRAS gene family, microsynteny analysis was employed for the precise region containing GRAS genes in pepper, tomato and Arabidopsis. (Fig. 3). A total of 37, 15 and seven orthologous gene pairs were identified in the cross of pepper and tomato, pepper and Arabidopsis, tomato and Arabidopsis, respectively. Several such regions were also found between different chromosomes in pepper. These data indicate that GRAS gene family is highly conserved, and pepper CaGRAS genes are more closely to those of tomato than that in Arabidopsis. The GRAS genes with microsynteny may evolve from the same ancestor.
Phylogenetic analysis, classification and functional characterization of CaGRAS family
To uncover the evolutionary relationships among CaGRAS proteins and their classifications, we performed a phylogenetic analysis using 189 full-length GRAS proteins (32 from Arabidopsis, 56 from rice, 51 from tomato and 50 from pepper). An unrooted phylogenetic tree was constructed (Fig. 4), demonstrating that all of these GRAS proteins could be classified into eleven distinct subfamilies based on clade support values and classification from Arabidopsis and rice. They were termed as DELLA, PAT1, SCL3, SHR, SCR, LISCL, HAM, LAS, DLT, Ca_GRAS and Os4, respectively. Of these, nine were named specifically according to the findings of previous study (Tian et al., 2004). The remaining two were named by the members of species origin. For example, the subfamily Ca_GRAS consisted of six GRAS members from tomato and six GRAS proteins from pepper, indicating that it might be a Solanaceae-specific group. Os4, a rice-specific subfamily, only contained eleven GRAS members from rice.
Generally, the genes clustered into a group tend to possess similar function and structure. Therefore, we could predict the potential function of CaGRAS members based on Arabidopsis or rice homologues in the same branch. For example, within the PAT1 subfamily, AtPAT1 and AtSCL13 were previously shown to be involved in phyA and phyB signaling pathway, respectively. Hence, we inferred that CaGRAS28 and CaGRAS18 in the same subfamily may also play a significant role in phytochrome signal transduction. DELLA subfamily included two CaGRAS members (CsGRAS14 and CaGRAS41), five AtGRAS and six OsGRAS. All these members contain the complete DELLA and TVHYNPS motifs (Fig. 5). Previous studies reported that DELLA proteins mainly regulate GA signal transduction pathway (Zhang et al., 2011) implying that CaGRAS14 and CaGRAS41 may have the similar role. The SCL3 subfamily consisted of two CaGRAS members (CaGRAS2 and CaGRAS44) and one AtGRAS (AtSCL3). This subfamily may mediate GA homeostasis through integrating other signals, because AtSCL3 was found to regulate root cell elongation by integrating multiple signals in Arabidopsis (Zhang et al., 2011). For subfamilies SHR and SCR, AtSHR and AtSCR were detected to function importantly in maintaining stem cell and root meristem. It is reasonable to predict that those pepper GRAS homologs in these two subfamilies may possess the similar functions (Di Laurenzio et al., 1996). LISCL subfamily consisted of nine CaGRAS, six AtGRAS and three OsGRAS members, and their biological roles are mostly unknown although a homolog member (LISCL) from Lilium longiflorum was proven to play an important regulatory role during microsporogenesis (Morohashi et al., 2003). The first HAM gene in the HAM subfamily was isolated from petunia and proved to promote shoot indeterminacy (Stuurman, Jaggi & Kuhlemeier, 2002). CaGRAS3 in the HAM subfamily was also demonstrated to be involved in shoot apical meristem organization and axillary meristem development (David-Schwartz et al., 2013). The LAS subfamily comprised two members from pepper, three from rice and three from Arabidopsis. AtLAS proteins in this subfamily mainly function to regulate and promote the initiation of axillary meristems (Liang et al., 2014). The DLT subfamily, the smallest group, contained six members (one from pepper, one from Arabidopsis, two from rice and two from tomato). The members of this group have been previously shown to participate in brassinosteroid signal pathway responsible for the plant height (Tong et al., 2009). For the Ca_GRAS subfamily having six CaGRAS and six SlGRAS members, no Arabidopsis and rice GRAS homolog was grouped into this subfamily, indicating that these genes may be Solanaceae-specific. The function of this subfamily awaits further exploration.
To investigate the common feature of pepper GRAS proteins in more detail, we used MEME suite to identify their conserved motifs and sequence logos. A total of 11 conserved motifs (named Motif 1–11) were identified, with more motifs locating at C-terminus than at N-terminus. Moreover, the motifs from the same subfamily nearly hold the similar patterns (Fig. 5). We then matched up the motifs with corresponding GRAS domain. It was found that Motif 10 and 4 is in LHRI domain at N-terminus, followed by Motif 7 and 1 in VHIID domain, Motif 6 and 8 in LHRII domain, Motif 9, 3 and 11 in PFYRE domain, and Motif 2 and 5 in SAW domain at C-terminus (Fig. 5). Of the 10 subfamilies of CaGRAS, members from PAT1 and LISCL subfamilies all contained the 11 conserved motifs identified.
Prediction of CaGRAS protein–protein interaction network
Due to unavailable reference for pepper interactome data, we predicted the protein–protein interaction relationships of CaGRAS members based on the interologs from Arabidopsis. We only obtained the interaction information for 19 CaGRAS proteins, and generated a complex interaction network using these proteins (Fig. 6). In general, the members from the SCL3 subfamily (CaGRAS2 and CaGRAS44) owned more interaction partners than others. These were consistent with their working mechanisms, considering the fact that AtSCL3 protein could regulate GA homeostasis by integrating other signal pathway, although such a relationship needs to be confirmed (Zhang et al., 2011). CaGRAS33, a member of LAS subfamily, directly interacted with nine CaGRAS members, while CaGRAS7 from the same subfamily only had three interaction partners. Surprisingly, no interaction partner was detected for CaGRAS proteins from DELLA and DLT subfamily. Our interaction networks may provide important clues for understanding the functions of unknown proteins.
Expression analysis of CaGRAS genes in various tissues and fruit developmental stages
We used online available transcriptome data of three tissues (leaf, stem and root) and seven developmental stages of pericarp and placenta (mature green, breaker, five and 10 days post-breaker, six, 16, 25 days post-anthesis) to investigate the expression patterns of pepper GRAS genes (Fig. 7). The RPKM value for each of those CaGRAS genes was listed in Table S2. The transcripts for the other 12 CaGRAS genes were not detected in any tissues (RPKM < 0.001), which may be the result of pseudogenes. Generally, 25 CaGRAS genes were detected to express in all tissues, with only five members (CaGRAS8, CaGRAS16, CaGRAS29, CaGRAS38 and CaGRAS48) showing high expression levels (PPKM > 10). A number of CaGRAS genes exhibited a certain degree of tissue specificity. For example, CaGRAS18 and CaGRAS27 were only expressed in pepper pericarp. CaGRAS35 and CaGRAS43 were highly expressed in leaf while the transcripts of CaGRAS30 and CaGRAS34 largely accumulated in stem rather than in other tissues. Tissue-specific expression of these genes showed that they may highly participate in the corresponding tissue development. CaGRAS28 homologous with AtPAT1 showed high expression level in leaves, which is in line with AtPAT1 function as a positive regulator in phyA signal pathway (Bolle, Koncz & Chua, 2000). Several CaGRAS genes exhibited constitutive expression levels at most stages of pericarp development. For example, CaGRAS7 and CaGRAS42 displayed a relatively higher expression at green fruit stage (PC_6DPA and PC_16DPA), and then decreased gradually towards fruit ripening. This expression pattern implied that CaGRAS7 and CaGRAS42 may function importantly in the early fruit development. In addition, the similar expression patterns were often detected for gene pairs from duplication event, but not for all such genes. For instance, in the CaGRAS18/40 duplicated region, CaGRAS40 was highly expressed, whereas the other showed the opposite expression pattern. These differences implied that duplicated GRAS gene pairs may have diverged evolutionary outcomes.
Response of CaGRAS genes to different stress treatments
In order to elucidate the functions of CaGRAS genes responsive to GA stimuli, qRT-PCR was performed to examine the expression of such genes in seedling leaves after treatment with GA. In this study, 14 CaGRAS genes showed obvious changes in response to GA treatment (Fig. 8). Of them, the most upregulated gene was CaGRAS37, while the most downregulated gene was CaGRAS10. To broaden our knowledge regarding how these genes are affected by GA, we conducted a comprehensive analysis on cis-elements in the promoter regions of such 14 CaGRAS genes using PlantCARE (Lescot et al., 2002). Additionally, 12 CaGRAS genes were detected to contain at least one GA responsive element (GARE) in their promoter sequences, again confirming the function of these genes in mediating GA signal pathway in pepper (Table S4).
We further examined the expression levels of CaGRAS genes under abiotic stresses, including salt, drought and cold treatments. Compared to the control group, the expression of 12 CaGRAS genes were highly affected by these treatments, indicating that those genes may have diverse functions involving in plant responses to abiotic stresses. The downregulated expression was detected for six, two and four CaGRAS genes, respectively, under cold, drought and salt stresses. Furthermore, we found the upregulated genes exhibit a group-specific expression. For example, the expression of CaGRAS genes from the DELLA subfamily was significantly induced under cold stress. The genes in the SCL3 subfamily was highly upregulated under drought stress, and the genes in the PAT1 subfamily were highly induced by GA and other four stress treatments. Therefore, it is possible that different CaGRAS members function in different stress responses.
With the rapid development of bioinformatics, information stored in genome sequence is increasingly to become the target to explore the mechanism underlying plant growth and development. Recent studies in a number of higher plants by comparative genomics showed that GRAS transcription factors play significant roles in multiple biological processes (Huang et al., 2015; Lee et al., 2008; Wu et al., 2015; Xu et al., 2016). However, limited knowledge was available for GRAS genes in pepper. In this study, we conducted a systematic analysis on this important transcription factor family in pepper, including genome-wide identification of CaGRAS members, chromosomal localization, intron-exon structure, physical-chemical features, phylogenetic analysis, duplication events, microsyntenic mapping and expression profiles in various pepper tissues as well as their responses to different stresses.
A total of 50 CaGRAS genes were obtained from 34,903 protein-coding genes in pepper genome. The number of CaGRAS genes is actually more than that in Arabidopsis (32), P. mume (45), castor bean (46) (Lee et al., 2008; Lu et al., 2015; Xu et al., 2016), comparable to cabbage (48) and tomato (53) (Huang et al., 2015; Song et al., 2014), but less than those in rice (60) and Populus (106) (Tian et al., 2004). The variation of GRAS gene number might be related to gene duplication events or genome size. This study detected two pairs of tandem duplicated CaGRAS genes and 10 pairs of segmental duplicated CaGRAS genes. However, 15 SlGRAS members were identified as tandem duplications in tomato. It looks like that segmental duplication contribute more to pepper GRAS expansion than tandem duplication whereas tandem duplication may be major player in this regard for tomato. Moreover, pepper genome size (3.48 Gb) is about fourfold larger than tomato genome (900 Mb), indicating that expansion mechanisms of GRAS genes are different among lineages.
All 50 CaGRAS proteins were classified into 10 subfamilies according to their conserved domains and sequence homology in Arabidopsis and rice (Tian et al., 2004). Notably, we observed a Solanaceae-specific subfamily (Ca_GRAS) contained the members from pepper and tomato but no Arabidopsis and rice GRAS homolog, whereas a rice-specific subfamily (Os4) was not detected in Arabidopsis, tomato and pepper. In agreement of this, the species-specific GRAS subfamily also widely existed in other plant species, such as the Rc_GRAS subfamily in castor bean (Xu et al., 2016) and the Pt20 subfamily in Populus (Liu & Widmer, 2014). These species-specific GRAS genes might be lost from other plants or become highly specialized during evolution.
The categorization of CaGRAS family was further supported by analysis of conserved motifs in those pepper proteins. Conserved motifs were found within the GRAS domain regions which might function importantly. Although conserved motifs were identical among all CaGRAS proteins, a number of differences in chemical–physical characteristics were also detected for CaGRAS members. These differences may due to the amino acid discrepancies in the non-conserved regions of CaGRAS members, implying that different CaGRAS proteins may act different functions in their own microenvironments (Huang et al., 2015).
Another important finding is that most CaGRAS genes (84%) contain just one exon. The high percentage of such intronless GRAS genes is detected as 67.6%, 54.7%, 82.2% and 83.3% in Arabidopsis, Populus, P. mume and Chinese cabbage (Lee et al., 2008; Lu et al., 2015; Song et al., 2014; Tian et al., 2004), respectively, evidencing again that the GRAS proteins were highly conserved among those plant species. Besides GRAS genes, intronless genes were also enriched among some other gene families, such as SAUR genes, F-box gene families and DEAD box helicases (Aubourg, Kreis & Lecharny, 1999; Jain et al., 2007; Jain, Tyagi & Khurana, 2006). Given the fact that intronless genes are archetypical in prokaryotic genomes, the recent work by Zhang, Iyer & Aravind (2012) showed that the origin of plant GRAS genes is derived from the prokaryotic genomes by horizontal gene transfer, followed by duplication events in evolutionary history. This may explain the formation of substantial intronless GRAS genes in pepper genome.
Generally, an intrinsically disordered region (IDR) in an intrinsically disordered protein (IDP) allows protein to recognize and interact with various partners, which are crucial for molecular function. Bioinformatics analysis showed that GRAS protein is a kind of IDP (Sun et al., 2013). One of a typical IDR in GRAS protein is its highly variable N-terminus, which possess short interaction-prone segments and molecular recognition features responsible for recognizing and binding the specific partner of GRAS proteins. Here, pepper GRAS proteins were found to contain a highly variable N-terminal region, which is consistent with the notion that N-terminus of GRAS proteins were intrinsically disordered, contributing to the functional divergence of CaGRAS proteins.
For functional characterization of those CaGRAS genes, we performed an extensive analysis for their expression profiles in different tissues and stress conditions, particularly for those in pepper-specific subfamily without function information deduced from model plant Arabidopsis or rice. Our data showed that CaGRAS4 may be a pseudogene because of no expression level detected in any tissues. CaGRAS5 might be involved in pericarp and placenta development, showing a relatively high abundance during all consecutive development stages. On the whole, the expression profiles of CaGRAS genes varied greatly not only among different tissues, but members from the same subfamily. Likely, such a great expression variation was also observed for GRAS genes in Populus and P. mume (Huang et al., 2015). These results indicated that GRAS genes may have experienced neo-functionalization or sub-functionalization in many higher plants. The RPKM values of twelve CaGRAS genes from seven subfamilies (DELLA, PAT1, SHR, SCR, LISCL, LAS and Ca_GRAS) were not detected in any tissues, suggesting these genes may lose their functions during evolution. By contrast, higher expression levels of CaGRAS genes in several organs signified their important roles. For example, CaGRAS29 from the SHR subfamily was highly transcribed in root tissue, which is consistent with the function of its homologous AtSHR responsible for root development (Cui et al., 2007). CaGRAS41 from the DELLA subfamily expressed in all tissues played critical roles in controlling a variety of signal hubs, whereas no expression of CaGRAS2 from the same subfamily was detected in any tissues. It seems that functional diversification is occurred for the two CaGRAS genes from the DELLA subfamily. Overall, the current expression data obtained for CaGRAS genes in different tissues lay a foundation for further functional analysis of pepper GRAS members.
In general, hormones could regulate plant growth and development via the modulation of the related gene expression. GA is found to play important roles in many aspects of plant development such as organ elongation, germination and flowering time. It has been reported that expression of GRAS genes in tomato showed dose-dependent response to GA (Huang et al., 2017). Our results demonstrated that the majority of CaGRAS genes detected here displayed dramatic changes after GA treatment. The promoters of these CaGRAS genes contained at least one GARE, implying that a set of CaGRAS proteins could regulate plant adaptability to adversity through a complex regulatory network. Additionally, previous studies revealed that GRAS genes could affect plant responses to abiotic stresses. For example, BnLAS and PeSCL7, GRAS members from Brassica napus and poplar, were identified as the good targets for engineering to increase plant drought and salt tolerance (Ma et al., 2010; Yang et al., 2011). Combined analysis of all qPCR results revealed that several pepper GRAS genes were associated with the above three stress responses (cold, salt and drought), showing the cross-talking of GRAS genes in regulation of plant responses against various adversity. Notably, we found that CaGRAS members belonging to PAT1 group exhibit the similar expression patterns when stressed by GA and other abiotic treatments. Consistently, OsGRAS genes from rice PAT1 group were also reported to be involved in GA and stress responses. All these indicate that some GRAS genes may specifically coordinate plant responses to multiple stresses.
In this study, 50 CaGRAS members were characterized from pepper genome, and classified into 10 subfamilies based on phylogenetic relationships. Duplication event particularly segmental duplication was identified as the main driving force to GRAS gene expansion in pepper. Interaction network and expression profiles among CaGRAS genes were examined, illustrating important roles of CaGRAS proteins in regulating GA and abiotic stress responses. Taken together, our study is the first comprehensive characterization of GRAS genes in pepper. All these data provide the foundation to elucidate the GRAS-mediated molecular mechanism underlying plant growth and development as well as stress biology, showing that GRAS members could be selected as the targets for genetic improvement of stress tolerance in pepper and other related plants.