
Utility of the dataset

The dataset we assembled can be used in several ways. First, it provides a rich source of candidate genes for further in-depth study. Researchers interested in a particular developmental process, for example morphogenesis of the salivary gland, can simply search our annotations and retrieve an all-inclusive list of genes that were annotated as being expressed in that structure. Such a gene set can be further subdivided through manual curation by experts in the particular field, using our primary image data. Second, the clustering classification presented in this report allows one to address more abstract questions about the dataset, such as: Which genes are expressed in a regulated manner at the cellular blastoderm stage? Which genes are involved in organogenesis during late embryogenesis? Finally, the dataset represents a starting point for a systematic analysis of the sequence determinants of gene expression patterns. Clustering provides gene groupings based on spatio-temporal gene expression specificity, ranging from unique patterns, through small, tightly co-regulated gene sets, to large gene expression classes. These groupings can be tested against cis-regulatory prediction pipelines to identify significant associations between gene expression specificity and genomic sequence motifs.
Determining expression patterns is only a first step towards understanding gene function, and it is therefore important to intersect our spatial expression data with other genomic datasets. The tools that we developed allow anyone with an independent result that can be summarized as a list of genes, for example a cluster derived from targeted microarray analysis, to quickly determine the spatio-temporal expression patterns of these genes in the Drosophila embryo.
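The kind of intersection described above can be illustrated with a minimal sketch. It assumes the annotations are available as simple (gene, stage range, anatomical term) records; the record layout, the gene names, and the function names are hypothetical and do not reflect the actual tools or data format.

```python
from collections import defaultdict

# Hypothetical annotation records: (gene, stage range, anatomical term).
# The layout and the values are illustrative only.
ANNOTATIONS = [
    ("geneA", "stage 13-16", "embryonic salivary gland"),
    ("geneA", "stage 4-6", "ubiquitous"),
    ("geneB", "stage 13-16", "embryonic midgut"),
    ("geneC", "stage 13-16", "embryonic salivary gland"),
]


def expression_for_gene_list(genes, annotations):
    """Map each annotated gene in `genes` to its sorted (stage, term) pairs."""
    wanted = set(genes)
    result = defaultdict(list)
    for gene, stage, term in annotations:
        if gene in wanted:
            result[gene].append((stage, term))
    return {gene: sorted(pairs) for gene, pairs in result.items()}


def genes_expressed_in(term, annotations):
    """Return the sorted list of genes annotated with the given anatomical term."""
    return sorted({gene for gene, _stage, t in annotations if t == term})


# An independent result summarized as a gene list, e.g. a microarray cluster.
microarray_cluster = ["geneA", "geneC", "geneX"]
patterns = expression_for_gene_list(microarray_cluster, ANNOTATIONS)

# Genes from the list that have no annotation in the (hypothetical) dataset.
missing = sorted(set(microarray_cluster) - set(patterns))
```

Genes absent from the annotation set are reported separately rather than silently dropped, so that a missing gene is not mistaken for a gene without expression.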
To address the difficulty of summarizing the gene expression patterns of a group of genes, we developed a new visual aid: the anatogram. Anatograms show the “position” of a given gene set in the complex space of spatio-temporal gene expression patterns and represent a convenient way to summarize such data for groups of genes. Anatograms also provide an intuitive comparison of differences among groups of genes, which can supplement more rigorous statistical comparisons. In principle, any list of genes can serve to generate an anatogram; for example, the list of Drosophila genes homologous to a gene group in another organism, or the genes that contain a particular sequence motif. In this way, anatograms can be used to compare results from gene expression studies among different species. The color code is based on organ systems shared by metazoan organisms and can be adapted to spatio-temporal gene expression data in other animals, providing an organism-independent way to express this newly emerging class of gene expression data.
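How an anatogram is computed is not specified in this section. As a rough illustration only, one plausible way to tabulate the data behind such a summary is to count, for a given gene set, the share of annotations that fall into each organ system. The term-to-organ-system mapping, the gene names, and the function name below are assumptions made for the sketch.

```python
from collections import Counter

# Hypothetical mapping from anatomical terms to organ systems (illustrative only).
TERM_TO_SYSTEM = {
    "embryonic salivary gland": "digestive system",
    "embryonic midgut": "digestive system",
    "ventral nerve cord": "nervous system",
}


def organ_system_profile(genes, annotations, term_to_system):
    """Return the fraction of a gene set's annotations falling in each organ system."""
    wanted = set(genes)
    counts = Counter()
    for gene, term in annotations:
        # Skip genes outside the set and terms with no organ-system assignment.
        if gene in wanted and term in term_to_system:
            counts[term_to_system[term]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {system: n / total for system, n in counts.items()}


# Hypothetical (gene, anatomical term) annotations.
annotations = [
    ("geneA", "embryonic salivary gland"),
    ("geneB", "embryonic midgut"),
    ("geneC", "ventral nerve cord"),
    ("geneD", "ventral nerve cord"),
]

profile = organ_system_profile(["geneA", "geneB", "geneC"], annotations, TERM_TO_SYSTEM)
```

Profiles computed this way for two gene sets could be compared side by side, which is the kind of intuitive comparison the text attributes to anatograms; a real summary would presumably also resolve the data by developmental stage.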
