De novo assembly and transcriptome characterization: novel insights into catechins biosynthesis in Camellia sinensis
BMC Plant Biology volume 14, Article number: 277 (2014)
Tea is a popular natural non-alcoholic beverage consumed worldwide due to its bioactive ingredients, particularly catechins (flavan-3-ols). Catechins not only contribute to tea quality but also serve important functions in the anti-stress regulation of secondary metabolic pathways. However, the proportions of the various catechins differ among tea plant [Camellia sinensis (L.) O. Kuntze] cultivars. This study aimed to elucidate the biosynthetic mechanism of catechins. Transcriptomes from leaf tissues of four tea plant cultivars, ‘Yunnanshilixiang’, ‘Chawansanhao’, ‘Ruchengmaoyecha’, and ‘Anjibaicha’, were sequenced using the high-throughput sequencing platform Illumina HiSeq™ 2000, and de novo assembly was performed. Catechins contents were measured through reversed-phase high-performance liquid chromatography (RP-HPLC), and the biosynthetic pathway was also surveyed.
We constructed a unified unigene database. A total of 146,342 pairs of putative orthologs from the four tea plant cultivars, ‘Yunnanshilixiang’, ‘Chawansanhao’, ‘Ruchengmaoyecha’, and ‘Anjibaicha’ were generated. Approximately 68,890 unigenes (47.1%) were aligned to the sequences of seven public databases with a cut-off E-value of 1E-5. A total of 217 differentially expressed genes were found through RPKM values, and 150 unigenes were assigned to the flavonoid biosynthetic pathway using the integrated function annotation. The (−)-EGC and (−)-EC contents were significantly lower and the (+)-GC and (+)-C contents were abnormally higher in ‘Ruchengmaoyecha’ than in ‘Yunnanshilixiang’, ‘Chawansanhao’, and ‘Anjibaicha’. The proportion of catechins was confirmed by selecting critical genes (ANS, ANR, and LAR) for qRT-PCR analysis.
This study provided a global survey of transcriptomes from four tea plant cultivars and serves as an available resource of genetic diversity. The analyses of transcriptome profiles and physiological indicators not only identified the putative genes involved in the flavonoid biosynthetic pathway but also provided some novel insights for the mechanisms of catechins biosynthesis.
The tea plant, Camellia sinensis (L.) O. Kuntze, is naturally distributed in the Southeast Asia Monsoon region and has been cultivated in China as a commercially valuable plant for at least 2000 years. Tea is made from tea plant leaves and is consumed as a popular natural non-alcoholic beverage worldwide due to its bioactive ingredients, including tea polyphenols, theanine, and polysaccharides. Numerous reports have indicated that tea helps prevent cancer and cardiovascular, neurodegenerative, and other oxidative stress-related diseases. Green tea, black tea, or tea constituents have been shown to inhibit the development of cancer in animal models, such as lung tumorigenesis in A/J mice and intestinal tumorigenesis in Apcmin/+ mice. Population studies suggested that green and black tea consumption could reduce the risk of cardiovascular disease. The potent antioxidant and iron-chelating actions of tea extracts were shown to attenuate 6-hydroxydopamine (6-OHDA)-induced neuronal death. Moreover, tea has also been shown to prevent skin aging, liver cell injury, and inflammation. The benefits of tea are mainly attributed to catechins (flavan-3-ols), a group of polyphenolic compounds. As the principal flavor substances of tea, catechins usually account for 25% to 30% of the dry weight of fresh tea plant leaves. The accumulation of catechins in shoots may be related to energy storage and stress resistance.
The catechins in fresh tea leaves are usually classified into seven groups: (+)-gallocatechin [(+)-GC], (−)-epigallocatechin [(−)-EGC], (−)-epicatechin [(−)-EC], (+)-catechin [(+)-C], (−)-epigallocatechin gallate [(−)-EGCG], (+)-gallocatechin gallate [(+)-GCG], and (−)-epicatechin gallate [(−)-ECG]. The flavonoid biosynthetic pathway of C. sinensis has been characterized by numerous physiological, biochemical, and genetic studies. However, the molecular mechanisms underlying the biosynthesis of (−)-EGCG, (+)-GCG, and (−)-ECG remain unclear. Catechins such as (−)-EGC, (−)-EC, (+)-GC, and (+)-C are synthesized through the enzymatic catalysis of anthocyanidin synthase (ANS), leucoanthocyanidin reductase (LAR), and anthocyanidin reductase (ANR) in the late stage of flavonoid biosynthesis. (−)-EGCG and (−)-ECG may be biosynthesized by a newly discovered enzyme (epicatechin:1-O-galloyl-β-D-glucose O-galloyltransferase). The genes that encode these enzymes have been cloned or verified from C. sinensis, but information on their regulatory mechanisms remains lacking. The tea plant has a large genome. The genome of the tea plant (a perennial woody plant, ~4,000 Mb) is about 32.0, 9.3, 8.4, and 8.2 times the size of those of two annual herbaceous model plants, Arabidopsis thaliana (125 Mb) and rice (Oryza sativa, 430 Mb), and two perennial woody model plants, grapevine (Vitis vinifera, 487 Mb) and black cottonwood poplar (Populus trichocarpa, 485 Mb), respectively. A number of genes are involved in flavonoid biosynthesis, and almost all of them may have multiple copies. Only a few flavonoid biosynthetic genes in C. sinensis have been completely cloned and functionally identified. Catechins contents differ among tea plant cultivars. However, the exact mechanism responsible for this difference remains unclear.
This study elucidated the mechanisms and critical genes that regulate catechins biosynthesis. Transcriptomes of four tea plant cultivars from different provinces in China were sequenced using the high-throughput sequencing platform Illumina HiSeq™ 2000 and were de novo assembled. The tea plant samples used here included mid-leaf ‘Yunnanshilixiang’ (Tea_T1) from Yunnan province, small-leaf ‘Chawansanhao’ (Tea_T2) from Jiangsu province, high-temperature-tolerant large-leaf ‘Ruchengmaoyecha’ (Tea_T3) from Hunan province, and low-temperature-sensitive small-leaf ‘Anjibaicha’ (Tea_T4) from Zhejiang province (Figure 1). Because these four tea production areas differ markedly in geography and climate, namely plateau monsoon climate (Yunnan), coastal temperate climate (Jiangsu), inland subtropical monsoon climate (Hunan), and coastal subtropical monsoon climate (Zhejiang), the tea plants of Tea_T1, Tea_T2, Tea_T3, and Tea_T4 differ in morphology and physiology, such as leaf size and environmental adaptability. The contents and component proportions of catechins are among the important factors that determine tea-processing suitability and quality. ‘Yunnanshilixiang’, ‘Chawansanhao’, and ‘Anjibaicha’ are suitable for processing into green tea, whereas ‘Ruchengmaoyecha’ is suitable for black tea.
A common set of reference genes was established to analyze expression abundance and to compare the reads per kb per million reads (RPKM) values between libraries. A total of 217 differentially expressed unigenes were identified. Functional annotation analysis showed that 150 unigenes are involved in the flavonoid biosynthetic pathway. Seven genes (LAR-1, LAR-2, LAR-3, ANS-1, ANS-2, ANR-1, and ANR-2) that encode three key enzymes (ANS, ANR, and LAR) had different expression patterns among the four tea plant cultivars; the expression of six of them (LAR-2, LAR-3, ANS-1, ANS-2, ANR-1, and ANR-2) was positively associated with the concentration of their corresponding catechins. The remaining gene, LAR-1, may not be primarily responsible for catechins biosynthesis; that role appears to be filled by LAR-2 and LAR-3. Our study may serve as a reference for further studies on the multi-gene regulation of catechins biosynthesis in C. sinensis.
Sequencing and de novo assembly
Four cDNA libraries were constructed from RNA samples of fresh leaves of Tea_T1, Tea_T2, Tea_T3, and Tea_T4. Approximately 25.7, 21.5, 20.7, and 27.1 million raw reads of 200 bp, corresponding to 5.1, 4.3, 4.2, and 5.4 giga base pairs (Gbp), respectively, were generated by the Illumina HiSeq™ 2000 sequencing device. The Q20 percentages (bases with a sequencing error rate of at most 1%) all exceeded 93.75%, and the GC percentages were 44.41%, 46.65%, 51.08%, and 46.09%, respectively (Table 1). Adaptor sequences, duplicated sequences, ambiguous reads, and low-quality reads were removed, and the high-quality reads of each cultivar were separately de novo assembled using the Trinity program. The assembly produced 86,523 unigenes with a mean size of 591 bp for Tea_T1, 54,980 unigenes with a mean size of 601 bp for Tea_T2, 34,442 unigenes with a mean size of 530 bp for Tea_T3, and 74,894 unigenes with a mean size of 596 bp for Tea_T4 (Table 1).
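The Q20 figure can be read through the Phred scale, on which a quality score Q corresponds to a base-calling error probability of 10^(−Q/10), so Q20 means an error rate of 1%. The following is a minimal Python sketch of that relationship. The Phred+33 quality encoding and both function names are illustrative assumptions, not details reported for this dataset.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-calling error probability."""
    return 10 ** (-q / 10)


def q20_fraction(quality_strings, offset=33):
    """Fraction of bases whose Phred quality is at least 20.

    `offset=33` assumes Phred+33 encoded FASTQ quality strings
    (an assumption; the encoding is not stated in the text).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= 20:
                passing += 1
    return passing / total if total else 0.0
```

Under these assumptions, a library with a Q20 percentage above 93.75% has more than 93.75% of its bases at or above the 1% error threshold.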
A common set of reference genes (the Tea.Unigene library) was built for the subsequent analysis of expression abundance and differentially expressed genes. A total of 146,342 unigenes were obtained from the four tea plant cultivars, with average unigene and N50 lengths of 526 bp and 648 bp, respectively (Table 2). The length distribution of the unigenes is shown in Figure 2.
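For readers unfamiliar with the N50 statistic, the sketch below shows one common way to compute it from a list of sequence lengths: sort the lengths in descending order and report the length at which the running total first reaches half of all assembled bases. It is an illustrative Python sketch with hypothetical function names, not the script used to produce Table 2.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths) if lengths else 0.0
```

Because long sequences contribute more bases, the N50 is usually larger than the mean length, which matches the relationship between the 648 bp N50 and the 526 bp average reported here.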
Functional annotation and categorization
All unique sequences were annotated using BLASTX against the NCBI non-redundant nucleotide (Nt) database, NCBI non-redundant (Nr) protein database, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG), and UniProtKB (Swiss-Prot and TrEMBL) to annotate the assembly as comprehensively as possible. A total of 68,890 unigenes (47.07%) were annotated at a significance threshold of E-value ≤1E-5. The remaining unigenes (52.93%) could not be matched to known genes (Table 3), most likely because of the presence of short sequences (44.36% <300 bp) and the shortage of relevant genetic data.
E-value and species distribution were also analyzed by evaluating the matched unigenes (58,678) from the returned BLASTX results of the Nr protein database. Very strong homology was observed in 41.71% of the aligned sequences (E <1E-50), and 58.29% of the homolog sequences ranged from 1E-50 to 1E-5 (Figure 3A). The species distribution of the top hits that matched the sequences showed that Vitis vinifera (41.16%) had the greatest number of matches with C. sinensis, followed by Populus trichocarpa (9.23%), Ricinus communis (7.84%), Arabidopsis thaliana (4.32%), Glycine max (3.67%), Arabidopsis lyrata (3.25%), Medicago truncatula (2.10%), Oryza sativa Japonica Group (0.95%) and Hordeum vulgare (0.88%) (Figure 3B).
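The E-value breakdown above amounts to filtering the best hits at the 1E-5 cut-off and then splitting the retained hits at 1E-50. A minimal Python sketch of that bookkeeping follows. The function name and the input (a plain list of best-hit E-values, one per unigene) are assumptions made for illustration.

```python
def evalue_summary(best_hit_evalues, cutoff=1e-5, strong=1e-50):
    """Summarise best-hit E-values.

    Hits with E-value <= cutoff are kept; of those, hits with
    E-value < strong are counted as very strong homology.
    """
    kept = [e for e in best_hit_evalues if e <= cutoff]
    if not kept:
        return {"kept": 0, "strong_pct": 0.0, "moderate_pct": 0.0}
    n_strong = sum(1 for e in kept if e < strong)
    strong_pct = 100.0 * n_strong / len(kept)
    return {
        "kept": len(kept),
        "strong_pct": strong_pct,
        "moderate_pct": 100.0 - strong_pct,
    }
```

Applied to the 58,678 Nr-matched unigenes, the two percentages would correspond to the 41.71% and 58.29% shares shown in Figure 3A.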
The expressed C. sinensis genes were searched against the GO database to categorize standardized gene functions. Of the 58,678 unigenes previously annotated against the Nr database, 50,846 were assigned to three main GO categories (biological process, cellular component, and molecular function) and 64 subcategories using the Blast2GO and WEGO software (Figure 4). A total of 14,278 GO terms were collected, most of which were related to biological processes (9,106), followed by molecular function (3,843) and cellular components (1,329).
The major subcategories (above 25% genes) among the biological processes were “cellular process” (78.74%, 40,036), “metabolic process” (75.78%, 38,532), “response to stimuli” (50.38%, 25,615), “biological regulation” (49.66%, 25,248), “developmental process” (38.08%, 19,362), “cellular component organization or biogenesis” (36.68%, 18,649), “localization” (32.11%, 16,326), “multicellular organismal process” (31.90%, 16,219), “establishment of localization” (29.18%, 14,838), “reproduction” (23.12%, 11,756), and “reproductive process” (22.73%, 11,556). “Binding” (69.38%, 35,276) and “catalytic activity” (55.20%, 28,068) were the dominant molecular functions. The most highly represented cellular component was “cell part” (82.15%, 41,770), followed by “cell” (81.26%, 41,320), “organelle” (73.86%, 37,553), “membrane” (41.36%, 21,031), “organelle part” (36.49%, 18,553), and “macromolecular complex” (21.02%, 10,686).
COG was used to further evaluate the completeness of the tea plant transcriptome libraries and the validity of the annotation. A total of 17,028 unigenes were clustered into 25 functional categories (Table 3). The largest category was “General function prediction only” (25.52%, 4,345), followed by “Replication, recombination and repair” (13.88%, 2,363), “Transcription” (12.38%, 2,108), “Translation, ribosomal structure and biogenesis” (11.35%, 1,933), “Signal transduction mechanisms” (10.14%, 1,726), “Posttranslational modification, protein turnover, chaperones” (9.95%, 1,695), “Carbohydrate transport and metabolism” (7.32%, 1,246), “Energy production and conversion” (6.62%, 1,127), and “Amino acid transport and metabolism” (6.60%, 1,124). “Extracellular structures”, “Nuclear structure”, and “Cell motility” had the fewest unigenes (Figure 5).
To further explore the gene interactions and biological functions in C. sinensis leaves, the unigenes were searched against the reference canonical pathways in KEGG. A total of 15,300 unigenes were annotated with their corresponding Enzyme Commission (EC) numbers and were assigned to 251 KEGG pathways (Table 3, see Additional file 1). The most representative pathways were “ribosome” (ko03010, 856, 5.59%), “oxidative phosphorylation” (ko00190, 564, 3.69%), “protein processing in endoplasmic reticulum” (ko04141, 522, 3.41%), “RNA transport” (ko03013, 507, 3.31%), “spliceosome” (ko03040, 473, 3.09%), “purine metabolism” (ko00230, 355, 2.32%), “endocytosis” (ko04144, 325, 2.12%), “ubiquitin-mediated proteolysis” (ko04120, 311, 2.03%), “glycolysis/gluconeogenesis” (ko00010, 308, 2.01%), “starch and sucrose metabolism” (ko00500, 307, 2.01%), “RNA degradation” (ko03018, 299, 1.95%), “plant hormone signal transduction” (ko04075, 299, 1.95%), “mRNA surveillance pathway” (ko03015, 293, 1.92%), “pyrimidine metabolism” (ko00240, 282, 1.84%), “ribosome biogenesis in eukaryotes” (ko03008, 280, 1.83%), “phagosome” (ko04145, 259, 1.69%), and “cysteine and methionine metabolism” (ko00270, 244, 1.59%) (Figure 6A).
This study focused on the “Biosynthesis of other secondary metabolites” pathway present in C. sinensis leaves and revealed 140 unigenes for “Phenylpropanoid biosynthesis”, 38 unigenes for “Stilbenoid, diarylheptanoid, and gingerol biosynthesis”, 43 unigenes for “Flavonoid biosynthesis”, 35 unigenes for “Tropane, piperidine, and pyridine alkaloid biosynthesis”, 31 unigenes for “Isoquinoline alkaloid biosynthesis”, 6 unigenes for “Flavone and flavonol biosynthesis”, 3 unigenes for “Novobiocin biosynthesis”, 2 unigenes for “Caffeine metabolism”, 1 unigene for “Indole alkaloid biosynthesis”, and 1 unigene for “Streptomycin biosynthesis” (Figure 6B).
Differential gene expression in the tea plant libraries
The expression abundance of each sample was measured, and differentially expressed genes (DEGs) were identified between each pair of libraries. Clean reads from each sample were mapped back to the reference genes constructed above, and the mapped reads were counted to obtain RPKM values for evaluation. A total of 273 DEGs were detected among the four C. sinensis libraries, of which 106, 25, 39, 64, and 39 DEGs were predicted from “Tea_T1_vs_Tea_T2”, “Tea_T1_vs_Tea_T3”, “Tea_T1_vs_Tea_T4”, “Tea_T2_vs_Tea_T3”, and “Tea_T2_vs_Tea_T4”, respectively; no DEGs were found in “Tea_T3_vs_Tea_T4” (Figure 7). After overlapping genes were removed, 217 DEGs were obtained and hierarchically clustered. The gene expression profiles are shown in a heat map (Figure 8). Functional enrichment of the DEGs was analyzed against the COG, GO, KEGG, Swissprot, TrEMBL, Nr, and Nt annotations, with the whole transcriptome database as the background (P ≤0.05, hypergeometric test). The detailed results are given in Additional file 2.
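Two steps in this paragraph lend themselves to a short illustration: merging the pairwise DEG lists into one non-redundant set, and testing a functional category for enrichment with the hypergeometric distribution. The Python sketch below uses only the standard library. The function names and the dictionary layout are hypothetical, and the enrichment function is a generic upper-tail hypergeometric test rather than the exact procedure used by the authors.

```python
from math import comb


def union_of_degs(pairwise_degs):
    """Merge pairwise DEG collections into one non-redundant set.

    `pairwise_degs` maps a comparison label (for example
    "Tea_T1_vs_Tea_T2") to an iterable of unigene identifiers.
    """
    merged = set()
    for degs in pairwise_degs.values():
        merged |= set(degs)
    return merged


def hypergeom_pvalue(k, K, n, N):
    """Upper-tail hypergeometric P-value, P(X >= k).

    N: unigenes in the background, K: background unigenes carrying
    the annotation, n: number of DEGs, k: DEGs carrying the annotation.
    """
    upper = min(K, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)
```

As a consistency check on the counts reported above, the five non-empty pairwise comparisons sum to 106 + 25 + 39 + 64 + 39 = 273, and removing unigenes shared between comparisons leaves the 217 unique DEGs.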
Genes involved in the flavonoid biosynthetic pathway based on tea plant leaf transcriptome
Catechins are the main flavonoid components of tea; they are not only important for tea quality but also related to the growth and metabolism of the tea plant. The catechins pathways in the four tea plant cultivars were analyzed in this study (Figure 9). A total of 150 unigenes involved in flavonoid biosynthesis were annotated and found to encode 18 putative enzymes on the basis of integrated functional annotation (COG, GO, KEGG, Swissprot, TrEMBL, Nr, and Nt annotation). The genes potentially related to catechins biosynthesis are detailed in Additional file 3.
Catechins contents among the four tea plant cultivars detected through RP-HPLC
RP-HPLC was performed to separate and detect catechins with high accuracy and sensitivity. The polyphenols extracted from the four parts of dry tea leaves were used for testing (Figure 10). The contents and proportions of the four components [(−)-EGC, (−)-EC, (+)-GC, and (+)-C] were highly similar in three tea plant cultivars, namely, Tea_T1, Tea_T2, and Tea_T4. In these three cultivars, the (−)-EGC and (−)-EC contents were always higher than the (+)-GC and (+)-C contents. The leaves of these cultivars were similar in type and were small or medium in size. In contrast, the large-leaf cultivar Tea_T3 showed the opposite pattern: its (−)-EGC and (−)-EC contents were clearly lower than its (+)-GC and (+)-C contents (Figure 11).
Expression profiles of the genes involved in catechins biosynthesis in tea plant
RPKM values were used to analyze the expression of the 150 unigenes involved in flavonoid biosynthesis in the four tea plant cultivars and thereby evaluate the catechins production capacity of the tea plants (Additional file 3). Unigenes with the highest and the lowest expression levels were taken as up-regulated and down-regulated genes, respectively. We identified 38 up-regulated and 8 down-regulated unigenes in Tea_T1, 40 up-regulated and 12 down-regulated unigenes in Tea_T2, 26 up-regulated and 64 down-regulated unigenes in Tea_T3, and 44 up-regulated and 12 down-regulated unigenes in Tea_T4. The digital expression profiles revealed a distinct expression pattern in Tea_T3.
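One possible reading of this classification is that each unigene is credited as up-regulated in the cultivar where its RPKM is highest and as down-regulated in the cultivar where it is lowest. The Python sketch below implements that reading. It is an interpretation, not a documented procedure: the function name, the input layout, and the decision to skip ties are all assumptions.

```python
from collections import Counter


def count_extremes(rpkm_by_unigene):
    """Count, per cultivar, unigenes at their highest and lowest RPKM.

    `rpkm_by_unigene` maps a unigene identifier to a dict of
    {cultivar: RPKM}. A unigene whose extreme value is shared by
    several cultivars is not counted for that direction (an assumption).
    """
    up = Counter()
    down = Counter()
    for values in rpkm_by_unigene.values():
        highest = max(values.values())
        lowest = min(values.values())
        top = [c for c, v in values.items() if v == highest]
        bottom = [c for c, v in values.items() if v == lowest]
        if len(top) == 1:
            up[top[0]] += 1
        if len(bottom) == 1:
            down[bottom[0]] += 1
    return up, down
```

Under this reading, the per-cultivar counts cannot exceed the number of unigenes analyzed, which is consistent with the reported up-regulated counts (38 + 40 + 26 + 44 = 148) for 150 unigenes.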
Seven long unigene fragments that encode three enzymes (ANS, ANR, and LAR) at the late stage of flavonoid biosynthesis were selected for verification through qRT-PCR analysis (Figure 12). These three enzymes dominate catechins production in the tea plant. The results showed that the expression profiles of the seven unigenes in Tea_T3 differed from those in the other cultivars, and five of the unigenes almost exactly coincided with the predicted results. Moreover, when the expression profiles were compared with the previously measured catechins contents, six unigenes (LAR-2, LAR-3, ANS-1, ANS-2, ANR-1, and ANR-2) showed strong positive correlations with the concentrations of their corresponding catechins. This result suggests that the difference in the gene expression profiles of Tea_T3 may have caused catechins diversity.
C. sinensis is a perennial cross-pollinating plant with rich genetic diversity among populations. The phenotypic diversity of C. sinensis is mainly embodied in plant height, flower, leaf size, and locule number. Catechins accumulate specifically in tea plant leaves, and their contents vary widely among tea plant cultivars. Transcriptome sequencing of multiple cultivars provides a good tool for understanding the complex transcriptional regulation and metabolic pathways of the different catechins components. The Illumina HiSeq™ 2000 sequencing platform was used because of its higher throughput, lower cost, and greater output compared with the Illumina Solexa Genome Analyzer. Multiple samples can also be sequenced simultaneously by barcoding (multiplexing). A total of 146,342 unigenes were obtained from the tea leaf transcriptomes of the four cultivars; this number is higher than that reported from the leaves of another tea variety (25,637 unigenes) and lies between the assembly results from mixed tissues (127,094 unigenes) and from cold-acclimated leaves (179,753 unigenes) of C. sinensis. The transcriptome of mixed tissues may not have reached saturation, as judged by mapping it to the transcriptome of cold-acclimated leaves. In addition, leaves acclimated to cold temperatures express more genes than normal leaves. These findings indicate that the information obtained in this study on C. sinensis leaves is relatively comprehensive. It is noteworthy that the data sizes of the four samples differed: the smallest dataset, from Tea_T3, had only 34,442 unigenes, whereas the largest, from Tea_T1, had 86,523 unigenes. Therefore, each of the four datasets independently contributed to the construction of the Tea.Unigene library.
A total of 68,890 (47.07%) of the 146,342 unigenes from the Tea.Unigene library were annotated against public databases (GO, COG, KEGG, Swissprot, TrEMBL, Nr, and Nt) for comprehensive analysis. Previous studies included only 55,088 annotated unigenes out of 127,094 unigenes, 22,872 annotated unigenes out of 25,637 unigenes, and 53,201 annotated unigenes out of 179,753 unigenes. Compared with these studies, the present study obtained more complete annotation information. The annotations in this study were compared with the GO annotation from Shi et al.; the principal differences were found in “response to stimulus” (biological process), “membrane” (cellular component), and “nucleic acid binding transcription factor activity” (molecular function). This result indicated that the leaves were prominent at the molecular level in response to stimulation and membrane metabolism. The COG annotation was also compared with previous studies on C. sinensis. Results showed that the unigene expression profiles under the category “posttranslational modification, protein turnover, chaperones” did not specifically appear in tea leaf tissue. The present results were more similar to the findings of Shi et al. than to those of Wu et al.
The abundance of gene expression is often used to explore gene expression profiles, and DEGs are found among different developmental stages, tissues, treatments, and species. In the present study, 217 DEGs were identified and annotated from the four tea plant cultivars. The COG functional classification showed that the major differences among the four tea plant cultivars were in “energy production and conversion”, “posttranslational modification, protein turnover, chaperones”, “general function prediction only”, and “secondary metabolites biosynthesis, transport and catabolism”. However, no DEGs were found for “RNA processing and modification”, “cell cycle control, cell division, chromosome partitioning”, “nuclear structure”, “defense mechanisms”, “extracellular structures”, and “intracellular trafficking, secretion, and vesicular transport”. Overall, the cultivars still exhibited high molecular stability.
The four tea plant cultivars (Tea_T1 to Tea_T4) from different provinces differed in leaf size, plant morphology, and stress resistance. It is generally believed that catechins contents are higher in large-leaf than in small-leaf tea plants. However, this has not been validated at the molecular level. Many genes correlate with the concentrations of catechins in the flavonoid biosynthetic pathway of C. sinensis. These genes include PAL, C4H, F3H, and ANR. Almost all genes involved in catechins biosynthesis were also found in other species. However, the expression of a particular gene does not necessarily indicate a relationship with catechins contents because of the complexity of flavonoid biosynthesis and the existence of gene isomers.
Integrated functional annotation and further RPKM value analyses of the four tea plant cultivars showed that 150 unigenes were involved in the flavonoid biosynthetic pathway. Digital expression profiles revealed that a different expression pattern may exist in Tea_T3. We further confirmed the expression variations and the relationship between gene expression and catechins biosynthesis as follows. Three key enzyme genes (ANS, ANR, and LAR) were selected for qRT-PCR analysis, and four end products [(−)-EGC, (−)-EC, (+)-GC, and (+)-C] of the flavonoid biosynthetic pathway were selected for RP-HPLC detection. The steps from ANS to ANR are unique to the synthetic pathway of (−)-EC and (−)-EGC. The low expression levels of ANS-1, ANS-2, ANR-1, and ANR-2 in Tea_T3 can explain the low contents of (−)-EC and (−)-EGC in this cultivar. LAR is involved in the synthesis of (+)-C and (+)-GC, which accumulated significantly in Tea_T3. LAR expression therefore appeared to be higher in Tea_T3; however, one of the LAR genes, LAR-1, actually had low expression in Tea_T3. These results suggest that the other two LAR genes (LAR-2 and LAR-3), and possibly more, control (+)-C and (+)-GC generation in the tea plant.
Catechins are the largest group of secondary metabolites in tea and are very important for processing suitability and quality. In green tea, (−)-EGCG is the most abundant catechin, followed by (−)-EGC, (−)-ECG, and (−)-EC, whereas (+)-C and (+)-GC are usually present only in trace amounts. However, (+)-GC is considered the most important catechin for sensory quality in black tea, and (+)-C correlates positively and significantly with various individual quality attributes and total quality scores. The contents and component proportions of catechins in C. sinensis are mainly determined by the cultivar and by environmental conditions. In the present study, four tea plant cultivars from different origins were selected for RP-HPLC detection of catechins. The results showed that the (−)-EC and (−)-EGC contents in ‘Yunnanshilixiang’ (Tea_T1), ‘Chawansanhao’ (Tea_T2), and ‘Anjibaicha’ (Tea_T4) were higher than those in ‘Ruchengmaoyecha’ (Tea_T3), whereas the (+)-C and (+)-GC contents in Tea_T3 were higher than those in the other three tea plant cultivars. These results are consistent with the suitability of the cultivars for green tea or black tea processing. In addition, the relevance of three key structural genes (ANS, ANR, and LAR) to the diversity of catechins components in the four tea plant cultivars was confirmed by analyzing their expression profiles. This will help to explore tea-processing suitability at the molecular level and to develop better germplasm resources of tea plants based on the genetic and metabolic regulation of catechins.
This study provides a global survey of transcriptomes from four C. sinensis cultivars and thus may serve as an available genetic diversity resource for the tea plant. Analyses of transcriptome profiles and physiological indicators identified putative genes involved in the flavonoid biosynthetic pathway. Results showed that the multi-gene regulation of catechins in the large-leaf cultivar differed significantly from that in the other cultivars. Comparison of the expression profiles and catechins contents of the cultivars suggests that the expression levels of ANS, ANR, and LAR may cause differences in catechins components. This study provides novel insights into the mechanisms of catechins biosynthesis in tea leaves.
Plant material and RNA isolation
Five-year-old cutting tea plant seedlings of Tea_T1, Tea_T2, Tea_T3, and Tea_T4 were planted in a growth chamber at the Tea Science Research Institute, College of Horticulture, Nanjing Agricultural University (Nanjing, China). The plants were grown in acidic soil (pH 5.6), and the conditions were maintained at 23 ± 2°C temperature and 70 ± 10% relative humidity. Four young tea plant leaves were selected, quickly frozen in liquid nitrogen, and then stored at −80°C for RNA extraction.
RNA was extracted from the tea plants according to the instruction manual of the Quick RNA isolation Kit (Huayueyang Biotech Co., Ltd., Beijing, China). The extracted RNA was treated with RNase-free DNaseI (TaKaRa Biotech Co., Ltd., Dalian, China) to remove residual DNA. RNA integrity was checked through agarose gel electrophoresis (1.2%), and RNA concentration was estimated using an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA, USA).
Construction of cDNA library and Illumina sequencing
High-quality RNA samples from tea plants were sent to Biomarker Technologies Corporation (Beijing, China) for cDNA library construction and sequencing. Magnetic oligo(dT) beads were used to enrich poly(A) mRNA from the four independent RNA samples. The enriched mRNA was fragmented into small pieces, which were used as templates for cDNA synthesis. Double-stranded cDNA was synthesized using SuperScript II, buffer, dNTPs, RNaseH, and DNA polymerase I. The cDNA was purified using a QiaQuick PCR extraction kit (Qiagen, Inc., Hilden, Germany) and was eluted with EB buffer. The short cDNA fragments were subjected to end repair, adapter ligation, and size selection by agarose gel electrophoresis. The suitable fragments were then selected as templates for PCR amplification. The four constructed cDNA libraries of tea plant were sequenced using the Illumina HiSeq™ 2000 platform.
Data filtering and de novo assembly
High-quality clean reads were obtained by removing the adaptor sequences, duplicated sequences, ambiguous reads (‘N’), and low-quality reads. Transcriptomes from the four datasets were separately assembled de novo using Trinity (http://trinityrnaseq.sourceforge.net/). In brief, clean reads with a certain overlap length were first combined to form long fragments without N, called contigs. Related contigs were then clustered using the TGICL software to yield unigenes (without N) that could not be extended on either end, and redundancies were removed to obtain non-redundant unigenes.
Functional annotation of the assembled unigenes
The unigene sequences of the four tea plant cultivars were searched using BLASTX against the Nt, Nr, KEGG, GO, COG, Swiss-Prot, and TrEMBL databases (E-value ≤1E-5) to retrieve protein functional annotations based on sequence similarity. The databases were prioritized in the order Nr, Swiss-Prot, and KEGG to determine the direction of the unigene sequences. The best alignment results were used to predict the coding sequences (CDSs) of the unigenes, and the CDSs were translated into amino acid sequences using the standard codon table. The ESTScan software was used to determine the sequence direction of the unigenes that could not be aligned to any of the above databases. GO terms were assigned to each sequence annotated by BLASTX against the Nr database using the Blast2GO program with an E-value threshold of 1E-5 for further functional categorization. The WEGO software was used to plot the distribution of the GO functional classification of the unigenes. The unigene sequences were also aligned to the COG database to predict and classify possible functions. The unigenes were assigned to KEGG pathway annotations using BLASTX to analyze intracellular metabolic pathways and the related gene functions.
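The translation step can be sketched as follows. This Python example builds the standard genetic code (NCBI translation table 1) from its compact 64-character representation and translates an in-frame coding sequence up to the first stop codon. The function name and the choice to report unrecognized codons as "X" are illustrative assumptions, and the sketch does not reproduce the frame or strand selection performed from the BLASTX alignments.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate_cds(cds):
    """Translate an in-frame CDS, stopping at the first stop codon.

    Codons containing ambiguous bases are reported as 'X'
    (an assumption); trailing bases beyond a full codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate_cds("ATGGCCTAA")` yields the two-residue peptide `MA`, because translation stops at the `TAA` stop codon.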
Analysis of the functional enrichment of DEGs
The RPKM method eliminates the influence of gene length and sequencing depth on the calculation of gene expression. Therefore, RPKM values were directly used to compare gene expression between samples. The DESeq package was used to obtain the “base mean” value for identifying DEGs. An FDR ≤0.01 and an absolute log2 ratio ≥1 were set as the thresholds for significant differences in gene expression between two samples.
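The RPKM normalization and the two DEG thresholds can be made concrete with a short example. RPKM is conventionally computed as 10^9 × C / (N × L), where C is the number of reads mapped to a unigene, N is the total number of mapped reads in the sample, and L is the unigene length in bp. The Python sketch below applies that formula and the stated cut-offs. The function names are hypothetical, the handling of zero expression is an assumption, and the FDR is taken as an input rather than computed, so this does not reproduce the DESeq procedure.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_ratio(expr_a, expr_b):
    """log2(expr_b / expr_a), with zero expression handled explicitly."""
    if expr_a == 0 and expr_b == 0:
        return 0.0
    if expr_a == 0:
        return math.inf
    if expr_b == 0:
        return -math.inf
    return math.log2(expr_b / expr_a)


def is_deg(expr_a, expr_b, fdr, fdr_cutoff=0.01, min_abs_log2=1.0):
    """Apply the thresholds FDR <= 0.01 and |log2 ratio| >= 1."""
    return fdr <= fdr_cutoff and abs(log2_ratio(expr_a, expr_b)) >= min_abs_log2
```

With these definitions, a 1,000 bp unigene with 500 mapped reads in a sample of 20 million mapped reads has an RPKM of 25, and a two-fold change between two samples sits exactly on the |log2 ratio| = 1 boundary.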
Chromatographic conditions of RP-HPLC
The catechins contents of the four tea plant cultivars were analyzed through RP-HPLC according to GB/T8313-2008 (China). The samples were analyzed on a Shimadzu LC-20A series system (Shimadzu Co., Kyoto, Japan). A Hedera ODS-2 C18 analytical column (250 mm × 4.6 mm i.d., 5 μm nominal particle size) was used for chromatographic separation. Gradient elution conditions were modified to better separate peaks. Double distilled water was used as mobile phase A, and primary mobile phase A was used as mobile phase B. Gradient elution was performed at a 2:3 ratio of mobile phase A to mobile phase B. The separated catechins were detected using an SPD-20A UV detector.
Validation of the digital expression profiles through quantitative real-time PCR
Seven genes encoding three enzymes (ANS, ANR, and LAR) that act at the late stage of the flavonoid biosynthetic pathway, as annotated by KEGG, were chosen for validation among the four tea plant cultivars using quantitative real-time PCR (qRT-PCR). Gene-specific primers were designed using the Primer Premier 5.0 software. qRT-PCR was performed on a Bio-Rad iQ5 real-time PCR platform (Bio-Rad Laboratories, Inc., Hercules, CA, USA) using SYBR® Premix Ex-Taq™ (Tli RNaseH Plus), ROX plus (TaKaRa Biotech Co., Ltd., Dalian, China) according to the manufacturer’s instructions. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was selected as the internal control gene for normalization, as previously reported. The cDNAs were diluted 18-fold with nuclease-free deionized water, and 2 μL of each diluted sample was used as template in a 20 μL reaction mixture containing 10 μL of SYBR® Premix Ex-Taq (2×) (Tli RNaseH Plus), ROX plus, 0.4 μL of each primer (10 μM), and 7.2 μL of ddH2O. Thermal cycling was performed under the following conditions: 95°C for 30 s, followed by 40 cycles of 95°C for 5 s and 55°C for 20 s. Each reaction was performed in triplicate, and the average threshold cycle was used to estimate the relative gene expression levels with the 2^(−ΔΔCt) method. The data are expressed as the mean ± SD, and all primer information is listed in Table 4.
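The relative quantification described above can be sketched as follows. The calculation is the usual 2^(−ΔΔCt) form with GAPDH as the reference gene. The Ct values are invented for illustration, and the choice of a calibrator sample is an assumption, since the text does not state which cultivar served as the baseline.

```python
from statistics import mean
from typing import Sequence


def relative_expression(target_ct_sample: Sequence[float],
                        ref_ct_sample: Sequence[float],
                        target_ct_calibrator: Sequence[float],
                        ref_ct_calibrator: Sequence[float]) -> float:
    """Fold change by 2^(-ddCt), using the mean Ct of the technical replicates."""
    # dCt: target gene Ct minus reference gene (here GAPDH) Ct within the same sample.
    d_ct_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    d_ct_calibrator = mean(target_ct_calibrator) - mean(ref_ct_calibrator)
    # ddCt: sample dCt minus calibrator dCt.
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values (not data from the study).
fold_change = relative_expression(
    target_ct_sample=[24.0, 24.5, 23.5],      # target gene, sample of interest
    ref_ct_sample=[18.0, 18.0, 18.0],         # GAPDH, sample of interest
    target_ct_calibrator=[26.0, 26.0, 26.0],  # target gene, calibrator sample
    ref_ct_calibrator=[18.0, 18.0, 18.0],     # GAPDH, calibrator sample
)
print(fold_change)
```

The form used here assumes roughly equal amplification efficiencies for the target and reference genes; where efficiencies differ, an efficiency-corrected model would be more appropriate.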
Availability of supporting data
The data supporting the results of this article are included within the article.
Abbreviations
RPKM: Reads per kb per million reads
Nt: Non-redundant nucleotide database
Nr: Non-redundant protein database
Swiss-Prot: Annotated protein sequence database
TrEMBL: Computer-annotated supplement to Swiss-Prot database
KEGG: Kyoto encyclopedia of genes and genomes
COG: Clusters of orthologous groups of protein
DEGs: Differentially expressed genes
RP-HPLC: Reversed-phase high-performance liquid chromatography
qRT-PCR: Quantitative real-time PCR
The research was supported by the National Natural Science Foundation of China (31200520), Jiangsu Natural Science Foundation (BK2012774), Ph.D Programs Foundation of Ministry of Education of China (20120097120031), China and Jiangsu Postdoctoral Science Foundation (2013M541686, 1401050B).
The authors declare that they have no competing interests.
Conceived and designed the experiments: JZ ZJW. Performed the experiments: ZJW XHL ZWL ZSX JZ. Analyzed the data: ZJW JZ. Contributed reagents/materials/analysis tools: JZ. Wrote the paper: ZJW. Revised the paper: JZ ZJW. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: C. sinensis transcriptome.(XLSX 135 KB)
Additional file 2: List of annotation information of differentially expressed genes among four tea plant cultivars transcriptomes.(XLSX 104 KB)
Additional file 3: C. sinensis.(XLSX 60 KB)
Cite this article
Wu, ZJ., Li, XH., Liu, ZW. et al. De novo assembly and transcriptome characterization: novel insights into catechins biosynthesis in Camellia sinensis. BMC Plant Biol 14, 277 (2014). https://0-doi-org.brum.beds.ac.uk/10.1186/s12870-014-0277-4
- Camellia sinensis
- High-throughput sequencing
- Genetic diversity