PHENOPSIS DB: an Information System for Arabidopsis thaliana phenotypic data in an environmental context
© Fabre et al; licensee BioMed Central Ltd. 2011
Received: 26 October 2010
Accepted: 9 May 2011
Published: 9 May 2011
Interest in plant × environment interactions has been renewed in the post-genomic era. In this context, high-throughput phenotyping platforms have been developed to create reproducible environmental scenarios in which the phenotypic responses of multiple genotypes can be analysed consistently. These platforms benefit hugely from the development of suitable databases for the storage, sharing and analysis of the large amounts of data collected. In the model plant Arabidopsis thaliana, most databases available to the scientific community contain genetic and molecular biology data, but describe plant developmental stages and experimental metadata such as environmental conditions inadequately. Our goal was to develop a comprehensive information system for sharing the data collected in PHENOPSIS, an automated platform for Arabidopsis thaliana phenotyping, with the scientific community.
PHENOPSIS DB is a publicly available (URL: http://bioweb.supagro.inra.fr/phenopsis/) information system developed for storing, browsing and sharing online data generated by the PHENOPSIS platform, offline data collected by experimenters, and experimental metadata. It provides modules coupled to a Web interface for (i) the visualisation of environmental data of an experiment, (ii) the visualisation and statistical analysis of phenotypic data, and (iii) the analysis of Arabidopsis thaliana plant images.
Firstly, data stored in the PHENOPSIS DB are of interest to the Arabidopsis thaliana community, particularly in allowing phenotypic meta-analyses directly linked to environmental conditions on which publications are still scarce. Secondly, data or image analysis modules can be downloaded from the Web interface for direct usage or as the basis for modifications according to new requirements. Finally, the structure of PHENOPSIS DB provides a useful template for the development of other similar databases related to genotype × environment interactions.
Arabidopsis thaliana, a small flowering plant with a rapid life cycle, offers important advantages for research in genetics and molecular biology. Since 2000, the complete sequencing of its genome has enabled scientists to monitor gene expression on a genome scale in different organs and in different environmental conditions [e.g. 2, 3]. The broad-based knowledge of this plant includes extensive genetic maps of all five chromosomes, efficient technology for mutagenesis and transformation and a large range of biological resources available at the various Arabidopsis stock centers (Arabidopsis Biological Resource Center, Nottingham Arabidopsis Stock Center, Riken Bioresource Center, INRA-Versailles Genomic Resource Center and Lehle Seeds, a private company). Many structured databases and querying tools have been developed, providing repositories of large datasets and efficient applications for the determination of gene function (TAIR, NASC Proteomics, etc). While these databases provide extensive and robust genetic or molecular information, metadata such as the precise characterisation of environmental conditions or plant developmental phenotypes are generally poorly documented. This point has recently received attention and several guidelines have been proposed acknowledging the importance of comprehensive metadata, thus allowing cross-validation of experiments and meta-analysis procedures [6–10].
Unravelling gene function by large-scale mutant screening has mainly been based on the mean value of a phenotypic effect measured under a given lab condition. It is often assumed in this approach that phenotypic variation among plants is largely due to genotypic variation. However, the validity of this assumption was questioned by a recent study in which three genotypes of Arabidopsis thaliana were grown in 10 laboratories using the same standardised conditions. Despite the use of a common, highly detailed protocol, the 10 labs still obtained phenotypic variation within genotypes for molecular and leaf developmental traits. The results showed that even small differences in environmental conditions or plant handling substantially affected growth at different levels. This study clearly demonstrates the need for precise recording of environmental conditions and reproducible characterisation of phenotypic traits in order to enable data sharing and comparison across laboratories. While automated phenotyping platforms have been developed by many groups to obtain precise records of plant environmental conditions and growth phenotypes (Traitmill, PHENOSCOPE, WIWAM), these data are still not available through repository databases. One of the pioneer platforms for reproducible phenotyping of Arabidopsis thaliana was the PHENOPSIS platform developed in our group in 2003. In three highly controlled growth chambers, plants are subjected to different temperatures, day-lengths and drought treatments with an automatic recording of all environmental data. In platforms such as this, large quantities of environmental data, plant images and phenotypic data are produced for the study of genotype × environment effects on different plant processes. Procedures need to be conceived for the proper handling of these datasets, their efficient extraction and their sharing with the scientific community.
Here, we describe the content and utility of PHENOPSIS DB, an information system for the storage (database), analysis and sharing (Web interface, Web Services) of images and data collected in the PHENOPSIS platform.
Construction and content
PHENOPSIS DB contains phenotypic data and experimental and environmental metadata (see additional file 1: Description of the variables stored in PHENOPSIS DB). The phenotypic data include online (i.e. automatically recorded) and offline (i.e. manually recorded) plant images and sets of offline phenotypic measurements. Metadata consist of protocols, descriptions of variables, genotype characteristics and online environmental data.
Experiment protocols and variable descriptions
Each experiment is associated with a protocol that gives information about the experimental context. Other protocols describe how variables were obtained to ensure that all experimenters use the same methods to measure a given variable.
Arabidopsis thaliana genotypes may include ecotypes, inbred lines from specific crosses, mutants, etc. Each genotype is stored with information on its specific features and on the source of the material, i.e. the laboratory or stock center providing the seeds.
Climatic conditions (air temperature, air humidity, light intensity, vapor pressure deficit) in the PHENOPSIS growth chambers are continuously recorded during an experiment and automatically sent to the server. R functions check and insert them into the database. Plant watering data, i.e. the weight of individual pots before and after watering and the supplied amount of nutrient solution, are also automatically recorded and inserted into the database via real-time automated SQL requests.
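As an illustration of how one of the recorded climatic variables relates to the others, the sketch below derives vapor pressure deficit from air temperature and relative humidity. It is a minimal example that assumes the widely used Tetens-type saturation formula from FAO-56; the text does not state how PHENOPSIS computes this variable, and the function and parameter names are hypothetical.

```python
import math


def saturation_vapour_pressure_kpa(air_temp_c: float) -> float:
    """Saturation vapour pressure in kPa (Tetens-type formula, as used in FAO-56)."""
    return 0.6108 * math.exp(17.27 * air_temp_c / (air_temp_c + 237.3))


def vapour_pressure_deficit_kpa(air_temp_c: float, rel_humidity_pct: float) -> float:
    """Vapour pressure deficit in kPa from air temperature and relative humidity.

    Assumption: humidity is expressed as relative humidity in percent.
    """
    if not 0.0 <= rel_humidity_pct <= 100.0:
        raise ValueError("relative humidity must lie between 0 and 100 %")
    e_sat = saturation_vapour_pressure_kpa(air_temp_c)
    # The deficit is the share of the saturation pressure not yet occupied by water vapour.
    return e_sat * (1.0 - rel_humidity_pct / 100.0)


if __name__ == "__main__":
    print(round(vapour_pressure_deficit_kpa(20.0, 70.0), 3), "kPa")
```

Computing the deficit from the two raw sensor readings, rather than storing it alone, would allow it to be re-derived if a different saturation formula were preferred.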
Phenotypic data measured on plants
Non-invasive measurements, such as rosette and individual leaf area determination, plant growth stage records and transpiration measurements are performed during a growth run within PHENOPSIS. Invasive measurements, on the other hand, require the harvest of plants or plant parts and are performed at predefined dates (x days after sowing) or at given plant developmental stages. Examples are the determination of plant and organ fresh and dry weight, leaf thickness, leaf epidermal cell density and stomatal density. Both invasive and non-invasive measurements are inserted into the database via the Web interface. R functions are used to check data consistency before insertion.
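The consistency check performed before insertion can be pictured with the following sketch. It is written in Python rather than R, and the variable names, plausibility bounds and rules are hypothetical illustrations, not the checks actually implemented in PHENOPSIS DB.

```python
from datetime import date

# Hypothetical plausibility bounds per variable (name -> (min, max)).
# These values are illustrative only and are not taken from PHENOPSIS DB.
PLAUSIBLE_RANGES = {
    "rosette_area_mm2": (0.0, 50000.0),
    "leaf_thickness_um": (10.0, 1000.0),
}


def check_measurement(variable, value, measured_on, sown_on):
    """Return a list of problems found; an empty list means the record passes."""
    problems = []
    if variable not in PLAUSIBLE_RANGES:
        problems.append(f"unknown variable: {variable}")
    else:
        low, high = PLAUSIBLE_RANGES[variable]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append("value is not numeric")
        elif not low <= value <= high:
            problems.append(f"value {value} outside [{low}, {high}]")
    # A plant cannot be measured before it was sown.
    if measured_on < sown_on:
        problems.append("measurement date precedes sowing date")
    return problems


if __name__ == "__main__":
    print(check_measurement("rosette_area_mm2", -5.0, date(2010, 3, 20), date(2010, 3, 1)))
```

Returning the full list of problems, instead of stopping at the first one, lets an experimenter correct a whole upload in a single pass.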
Currently, 70 experiments are stored in the database, 15 of which are publicly available. They include 87000 phenotypic measurements on 865 genotypes, of which 50000 measurements on 620 genotypes are publicly available. In addition, 600000 images are stored in the database, more than 90000 of which are publicly available.
PHENOPSIS DB information system
The database was developed using the MySQL 5.0 Community Server and is composed of 15 physical tables (see additional file 2: Description of the physical data model of the PHENOPSIS DB database).
The Web interface
All metadata are freely available without restriction or authentication request. Metadata include: characteristics of experiments and associated protocols, list of genotypes grown in an experiment, list of variables measured in an experiment with their definition and associated protocols, comments on the experiments, micrometeorological data and plant watering data.
Images and phenotypic data from public experiments and public genotypes are also freely available without restriction or authentication request. The whole dataset associated with an experiment and/or a genotype becomes public as soon as the data have been published.
The access to images and phenotypic data from non-published experiments or confidential genotypes requires a user authentication that can be requested from the administrator in charge of the information system.
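The access rules described above can be summarised in a few lines. The sketch below is one simplified reading of them: the function name and boolean flags are hypothetical, and the real system may grant authenticated users access only to specific experiments.

```python
def can_access_data(experiment_is_public: bool,
                    genotype_is_public: bool,
                    user_is_authenticated: bool) -> bool:
    """Decide whether images and phenotypic data may be served to a user.

    Metadata are not covered here: according to the text they are always
    available without authentication.
    """
    # Published experiments on public genotypes need no authentication.
    if experiment_is_public and genotype_is_public:
        return True
    # Non-published experiments or confidential genotypes require a login.
    return user_is_authenticated


if __name__ == "__main__":
    print(can_access_data(True, False, False))
```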
Web Services were developed to enhance interoperability and data exchange with other systems (information systems, stand-alone programs). The PHENOPSIS DB Web Services are written in Java, based on the Tomcat/Axis solution, described in WSDL and accessed through the SOAP protocol.
Utility and discussion
PHENOPSIS DB Web interface
A user-friendly Web interface
Centralised information systems are often developed for data storage when datasets are too extensive for personal computers. They are also used to promote exchanges between researchers and to perform meta-analyses, requiring high traceability and reproducibility of datasets. This can only be ensured through comprehensive metadata, data collection protocols and data descriptions. The PHENOPSIS DB interface has been developed for a large scientific community and allows the browsing, downloading, visualisation and analysis of all data recorded in the PHENOPSIS platform. The PHENOPSIS platform and the information system structure are documented on the Web interface (see http://bioweb.supagro.inra.fr/phenopsis/Accueil.php?lang=En). In the Data Browsing and Download section, basic or advanced searches can be performed depending on the user's familiarity with the system.
Interoperability between PHENOPSIS DB and other databases
Both the use of standards and the integration of ontologies enhance the interoperability between PHENOPSIS DB and other biological databases. The genotype nomenclature is based on the TAIR international nomenclature [21, 22] and hyperlinks lead to the description of genotypes on the TAIR or NASC websites. The characterisation of growth stages follows the standard nomenclature described by Boyes et al. Whenever possible, measured organs are characterised according to the plant structure proposed in Plant Ontology. In addition, correspondences between plant growth variables and the ontologies of phenotypic traits were established. Some variables matched terms in Trait Ontology, while for others it was necessary to combine different ontologies (Phenotype, Attribute and Trait Ontology, Plant Ontology, etc) following the EQV (Entity Qualifier Value) model. Variables not clearly identified in existing ontologies were defined as precisely as possible and will be submitted to ontology consortiums.
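To make the combination of ontologies more concrete, the sketch below represents a variable as an entity term paired with a qualifier term and a measured value. It is only one possible reading of the EQV model: the class names are hypothetical and the term identifiers are placeholders, not real accession numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyTerm:
    ontology: str  # short name of the source ontology, e.g. "PO" or "PATO"
    term_id: str   # placeholder identifier, NOT a real accession number
    label: str     # human-readable label


@dataclass(frozen=True)
class EQVAnnotation:
    entity: OntologyTerm     # what was measured (a plant structure)
    qualifier: OntologyTerm  # which attribute of the entity was measured
    value: float             # the measured value
    unit: str

    def describe(self) -> str:
        return f"{self.entity.label} / {self.qualifier.label} = {self.value} {self.unit}"


if __name__ == "__main__":
    leaf = OntologyTerm("PO", "PO:xxxxxxx", "leaf")
    thickness = OntologyTerm("PATO", "PATO:xxxxxxx", "thickness")
    print(EQVAnnotation(leaf, thickness, 120.0, "µm").describe())
```

Keeping the entity and the qualifier as separate terms means a variable with no single matching trait term can still be expressed by pairing terms from two ontologies.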
Consultation of the experiments and/or genotypes
The Experiments subsection within the Data Browsing and Download section allows searches on experiments associated with a publication, given genotypes or a specific type of stress (see http://bioweb.supagro.inra.fr/phenopsis/ConsulterManip.php, e.g. select experiments without any environmental stress). In the advanced search, users can select additional filters such as measured variables, environmental conditions, etc. Each experiment is associated with a description that provides its general features, the genotypes studied and the variables measured, the characteristics of each pot (sowing date, weights for soil humidity calculation, etc), and the parameters for setting environmental conditions.
Download and analysis of phenotypic data
Users of the system can download the publicly available datasets in the Data Browsing and Download > Data measured on plants section (see http://bioweb.supagro.inra.fr/phenopsis/ConsulterMesurePlante.php), using similar searching criteria to those described above to restrict the downloading to specific data of interest.
Download and visualisation of environmental conditions during an experiment
Environmental data, including micrometeorological and plant watering data, can be consulted and downloaded in the Data Browsing and Download section. Two modules have been developed in the Graphs and Descriptive Statistics section to check the consistency between set and obtained environmental conditions and to assist in the precise monitoring of experiments. In the first module, micrometeorological data can be visualised and downloaded as graphs together with a basic statistical analysis. More specifically, the module displays the kinetics of the different meteorological data over an experiment together with a statistical summary (see http://bioweb.supagro.inra.fr/phenopsis/StatMeteo.php). In the second module, the soil water content in pots can be visualised and downloaded as graphs together with a basic statistical analysis (see http://bioweb.supagro.inra.fr/phenopsis/StatIrrigation.php). One application within the module displays the changes in soil humidity over an experiment for individual pots with a statistical summary. A second application produces graphs showing the soil water content of all pots in a PHENOPSIS growth chamber before and after watering at a given date and for each plant watering cycle.
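The kind of calculation behind the second module can be sketched as follows. The example assumes a gravimetric definition of soil water content (grams of water per gram of dry soil) in which plant mass is neglected; the formula, function names and numbers are illustrative assumptions, not the module's actual code.

```python
from statistics import mean, stdev


def soil_water_content(pot_weight_g: float, empty_pot_g: float, dry_soil_g: float) -> float:
    """Gravimetric soil water content in g water per g dry soil.

    Assumption: the mass of the plant itself is small enough to be neglected.
    """
    if dry_soil_g <= 0:
        raise ValueError("dry soil weight must be positive")
    water_g = pot_weight_g - empty_pot_g - dry_soil_g
    return water_g / dry_soil_g


def summarise_against_target(values, target):
    """Basic statistical summary of obtained values relative to a set point."""
    avg = mean(values)
    return {
        "n": len(values),
        "mean": avg,
        # The sample standard deviation needs at least two values.
        "sd": stdev(values) if len(values) > 1 else 0.0,
        "mean_deviation_from_target": avg - target,
    }


if __name__ == "__main__":
    pots = [(250.0, 30.0, 160.0), (246.0, 30.0, 160.0), (254.0, 30.0, 160.0)]
    contents = [soil_water_content(*pot) for pot in pots]
    print(summarise_against_target(contents, target=0.375))
```

Reporting the mean deviation from the set point alongside the spread distinguishes a chamber that is uniformly too dry from one in which individual pots diverge.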
Download and analysis of images
Users of the system can download the publicly available images in the Data Browsing and Download > Plant images section (see http://bioweb.supagro.inra.fr/phenopsis/ConsulterImages.php) and can restrict the downloading by applying filters. Plant images can be previewed, downloaded in ZIP files and used in the estimation of additional variables by applying other image analysis algorithms. For example, scans that have been used for the measurement of individual area of successive leaves on a rosette can be re-analysed to estimate shape parameters of the same leaves; similarly, leaf sections that have been used in the estimation of leaf thickness can be used in the measurement of vein diameter.
The Image Analyses and ImageJ Macros section provides tools for the analysis of large sets of plant images in an automatic or semi-automatic way using ImageJ macros (see http://bioweb.supagro.inra.fr/phenopsis/MacroImageJ.php). These macros can be downloaded and run as a stand-alone application for the analysis of (i) batches of rosette images to measure the projected rosette area of individual plants and (ii) leaf scans to measure individual leaf areas.
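The principle of measuring projected rosette area from an image can be illustrated with a short sketch: count the pixels classified as plant and multiply by the area of one pixel. This is not the algorithm of the ImageJ macros, which is not described here; the "green dominates" rule and all names are hypothetical simplifications.

```python
def is_plant_pixel(rgb) -> bool:
    """Hypothetical classification rule: a pixel is plant if green dominates."""
    r, g, b = rgb
    return g > r and g > b


def projected_rosette_area_mm2(image, mm_per_pixel: float) -> float:
    """Projected area in mm^2 for an image given as rows of (R, G, B) tuples."""
    plant_pixels = sum(1 for row in image for pixel in row if is_plant_pixel(pixel))
    # Each pixel covers a square of side mm_per_pixel.
    return plant_pixels * mm_per_pixel ** 2


if __name__ == "__main__":
    soil = (90, 60, 40)
    leaf = (40, 140, 50)
    tiny_image = [
        [soil, leaf, soil],
        [soil, leaf, soil],
    ]
    print(projected_rosette_area_mm2(tiny_image, mm_per_pixel=0.5), "mm^2")
```

In practice the scale factor would come from a calibration object of known size in the image, which is why it is passed in explicitly rather than hard-coded.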
PHENOPSIS DB Web Services
Our Web Services implement several methods. Currently, the main methods return the list and description of (i) the public genotypes studied in all experiments or in a specific experiment, (ii) the measured phenotypic variables or (iii) the different types of images collected. Additionally, it is possible to retrieve the sequence of visible images taken automatically in the growth chambers for plants of a specific genotype grown in a specific experiment. Using this last method, one can, for example, automatically generate animated images of individual plant growth. Example client applications in different languages (Python, PHP) can be downloaded from the Web interface.
The Web services are described at http://bioweb.supagro.inra.fr/phenopsis/WebService.php and available to client programs via the WSDL document http://bioweb.supagro.inra.fr/phenopsis/wsdl.
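For readers unfamiliar with SOAP, the sketch below shows what a request to such a service might look like on the wire. It builds a SOAP 1.1 envelope with the Python standard library only and sends nothing; the operation name `getGenotypes`, the parameter `experiment`, the value `C1` and the service namespace are hypothetical, and the real names must be read from the WSDL document.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:phenopsis"  # hypothetical; the real one is declared in the WSDL


def build_soap_request(operation: str, params: dict) -> str:
    """Return a SOAP 1.1 request envelope as an XML string."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    # The operation element sits inside the body, qualified by the service namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical operation and parameter names, for illustration only.
    print(build_soap_request("getGenotypes", {"experiment": "C1"}))
```

A dedicated SOAP client library that reads the WSDL would normally generate such envelopes automatically; writing one by hand is mainly useful for understanding or debugging the exchange.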
Examples of applications
The utility of PHENOPSIS DB for the analysis of large datasets has been demonstrated in recent studies. In a first example, the multi-scale analysis of leaf growth in 120 genotypes allowed the identification of robust emergent properties in the sub-cellular control of leaf development. Secondly, the comparison of the leaf growth response of the same 120 genotypes, grown in limited soil water content, allowed the detection of genotypes that maintained leaf growth under drought.
Examples of extensions
The whole system is flexible and easily upgradable to host new environmental or phenotypic variables and new types of images resulting from the evolution of research projects or the development of new protocols. For example, the creation of new environmental variables associated with mineral and abiotic stresses in soil is in progress. In addition, the development of a recent protocol for the 3D characterisation of leaf growth at the cellular level has required the creation of new phenotypic variables. Finally, as the platform is also used in the production of highly characterised leaf material for molecular, biochemical or mineral content analyses, variables will be extended to metabolite contents, enzyme activities, transcript profiling, etc [11, 30].
PHENOPSIS DB stores the millions of data points and hundreds of Gb of images generated yearly in the PHENOPSIS platform. The information system contains useful resources for the scientific community working on genotype × environment interactions in Arabidopsis thaliana. Moreover, its structure serves as a template for other groups developing similar systems.
Availability and requirements
PHENOPSIS DB is an open access database: http://bioweb.supagro.inra.fr/phenopsis/
It is referenced by APP (French Agency for Program Protection) under the INRA name and with number IDDN.FR.001.160017.000.R.P.2010.000.40000.
Metadata, images and phenotypic data from public experiments and public genotypes can be downloaded for further analyses. However, all analyses or figures produced using data accessed via PHENOPSIS DB must include a clear indication of sources such as: "This analysis is based upon data provided by PHENOPSIS DB", with citation of this paper. In the case of private data the acknowledgement must also include a statement such as "Permission to use these data was granted by <name, title and affiliation>".
Our group will maintain PHENOPSIS DB continuously and update it on a regular basis. Questions, comments and requests regarding this database should be sent to Vincent Negre at firstname.lastname@example.org.
We would like to thank Virginie Rossard for sharing her know-how on database management and MySQL. We thank Optimalog for the development of the PHENOPSIS platform and the automatic transfer of data. We are grateful to all users who helped us to improve and develop PHENOPSIS DB. We also thank the informatics team for technical and server support. Finally, we thank Sean Walsh for correcting the manuscript and the Web interface texts. This work was supported by Agron-Omics, a European sixth framework integrated project (LSHG-CT-2006-037704).
1. The Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
2. Seki M, Narusaka M, Ishida J, Nanjo T, Fujita M, Oono Y, Kamiya A, Nakajima M, Enju A, Sakurai T, Satou M, Akiyama K, Taji T, Yamaguchi-Shinozaki K, Carninci P, Kawai J, Hayashizaki Y, Shinozaki K: Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray. Plant J. 2002, 31: 279-292. 10.1046/j.1365-313X.2002.01359.x.
3. Wang R, Okamoto M, Xing X, Crawford NW: Microarray analysis of the nitrate response in Arabidopsis roots and shoots reveals over 1,000 rapidly responding genes and new linkages to glucose, trehalose-6-phosphate, iron, and sulfate metabolism. Plant Physiol. 2003, 132: 556-567. 10.1104/pp.103.021253.
4. The Arabidopsis Information Resource. [http://www.arabidopsis.org]
5. Proteomics Database for Arabidopsis data. [http://proteomics.arabidopsis.info]
6. Plant Ontology Consortium. [http://www.plantontology.org/]
7. Ilic K, Kellogg EA, Jaiswal P, Zapata F, Stevens PF, Vincent LP, Avraham S, Reiser L, Pujar A, Sachs MM, Whitman NT, McCouch SR, Schaeffer ML, Ware DH, Stein LD, Rhee SY: The plant structure ontology, a unified vocabulary of anatomy and morphology of a flowering plant. Plant Physiol. 2007, 143: 587-599.
8. Zimmermann P, Schildknecht B, Craigon D, Garcia-Hernandez M, Gruissem W, May S, Mukherjee G, Parkinson H, Rhee S, Wagner U, Hennig L: MIAME/Plant - adding value to plant microarray experiments. Plant Methods. 2006, 2: 1. 10.1186/1746-4811-2-1.
9. Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M: Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet. 2001, 29: 365-371. 10.1038/ng1201-365.
10. MIAME. [http://www.mged.org/Workgroups/MIAME/miame.html]
11. Massonnet C, Vile D, Fabre J, Hannah MA, Caldana C, Lisec J, Beemster GTS, Meyer RC, Messerli G, Gronlund JT, Perkovic J, Wigmore E, May S, Bevan MW, Meyer C, Rubio-Díaz S, Weigel D, Micol JL, Buchanan-Wollaston V, Fiorani F, Walsh S, Rinn R, Gruissem W, Hilson P, Hennig L, Willmitzer L, Granier C: Probing the reproducibility of leaf growth and molecular Phenotypes: A Comparison of Three Arabidopsis Accessions Cultivated in Ten Laboratories. Plant Physiol. 2010, 152: 2142-2157. 10.1104/pp.109.148338.
12. Cropdesign: Traitmill - Platform and Process. [http://www.cropdesign.com/tech_traitmill.php]
13. IJPB-Phénotypage haut débit chez Arabidopsis thaliana. [http://www-ijpb.versailles.inra.fr/en/ppa/ppa_accueil.htm]
14. Systems biology of drought tolerance in Arabidopsis. [http://www.psb.ugent.be/yield-research/465-projects2]
15. Granier C, Aguirrezabal L, Chenu K, Cookson SJ, Dauzat M, Hamard P, Thioux JJ, Rolland G, Bouchier-Combaud S, Lebaudy A, Muller B, Simonneau T, Tardieu F: PHENOPSIS, an automated platform for reproducible phenotyping of plant responses to soil water deficit in Arabidopsis thaliana permitted the identification of an accession with low sensitivity to soil water deficit. New Phytol. 2006, 169: 623-635.
16. R Development Core Team: R: A language and environment for statistical computing. 2009, R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, [http://www.r-project.org]
17. ImageJ. [http://rsbweb.nih.gov/ij/]
18. World Wide Web Consortium. [http://www.w3.org/]
19. The W3C Markup Validation Service. [http://validator.w3.org/]
20. W3C CSS Validation Service. [http://jigsaw.w3.org/css-validator/]
21. TAIR Nomenclature Guidelines. [http://www.arabidopsis.org/portals/nomenclature/guidelines.jsp#genbank]
22. Meinke D, Koornneef M: Community standards for Arabidopsis genetics. Plant J. 1997, 12: 247-253. 10.1046/j.1365-313X.1997.12020247.x.
23. Boyes DC, Zayed AM, Ascenzi R, McCaskill AJ, Hoffman NE, Davis KR, Görlach J: Growth stage-based phenotypic analysis of Arabidopsis: a model for high throughput functional genomics in plants. Plant Cell. 2001, 13: 1499-1510.
24. Gramene. [http://www.gramene.org]
25. PATO: Main Page - OBOFoundry. [http://obofoundry.org/wiki/index.php/PATO:Main_Page]
26. Mungall CJ, Gkoutos GV, Smith CL, Haendel MA, Lewis SE, Ashburner M: Integrating phenotype ontologies across multiple species. Genome Biol. 2010, 11: R2. 10.1186/gb-2010-11-1-r2.
27. Tisné S, Reymond M, Vile D, Fabre J, Dauzat M, Koornneef M, Granier C: Combined genetic and modeling approaches reveal that epidermal cell area and number in leaves are controlled by leaf and plant developmental processes in Arabidopsis. Plant Physiol. 2008, 148: 1117-1127. 10.1104/pp.108.124271.
28. Tisné S, Schmalenbach I, Reymond R, Dauzat M, Pervent M, Vile D, Granier C: Keep on growing under drought: genetic and developmental bases of the response of rosette area using a recombinant inbred line population. Plant Cell Environ. 2010, 33: 1875-1887. 10.1111/j.1365-3040.2010.02191.x.
29. Wuyts N, Palauqui JC, Conejero G, Verdeil JL, Granier C, Massonnet C: High-contrast three-dimensional imaging of the Arabidopsis leaf enables the analysis of cell dimensions in the epidermis and mesophyll. Plant Methods. 2010, 6: 17. 10.1186/1746-4811-6-17.
30. Ghandilyan A, Barboza L, Tisné S, Granier C, Reymond M, Koornneef M, Schat H, Aarts MGM: Genetic analysis identifies quantitative trait loci controlling rosette mineral concentrations in Arabidopsis thaliana under drought. New Phytol. 2009, 184: 180-192. 10.1111/j.1469-8137.2009.02953.x.
31. Optimalog. [http://www.optimalog.com/phenopsis.html]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.